Alfresco artifacts owned by root #328

Open
Fikili opened this issue Apr 23, 2022 · 9 comments

Comments

@Fikili
Contributor

Fikili commented Apr 23, 2022

Bug description

The ACS 7.2 installation was started with $ ansible-playbook playbooks/acs.yml -i inventory_ssh.yml
Some files end up owned by root and therefore cannot be used by the alfresco user. Examples:

  • Alfresco startup fails because alfresco.war and share.war are owned by root instead of the alfresco user.
  • ActiveMQ folders are owned by root
    • e.g. /opt/apache-activemq-5.16.4/bin
  • .ansible_alfresco_components.status not accessible

FYI, control node running on WSL2 with Ubuntu 20.

Workaround:
  • $ sudo chown alfresco:alfresco /opt/alfresco/content-services-7.2.0/web-server/webapps/*.war
  • $ sudo chown -R alfresco:alfresco /opt/apache-activemq-5.16.4/
  • $ sudo chown -R alfresco:alfresco /opt/apache-tomcat-9.0.59/
  • $ sudo chmod 666 /opt/alfresco/.ansible_alfresco_components.status
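
Before running the chown commands above, it may help to see exactly which paths are affected. The following is only a sketch: `list_not_owned` is a hypothetical helper (not part of the playbook) that relies on `find ... ! -user`, and it takes a numeric UID so that it does not depend on a specific user name.

```shell
#!/bin/sh
# Hypothetical helper, not part of alfresco-ansible-deployment.
# Lists every path under DIR that is NOT owned by the given numeric UID.
list_not_owned() {
    dir=$1
    uid=$2
    find "$dir" ! -user "$uid" -print
}

# Example call on the target host (paths taken from the report above):
#   list_not_owned /opt/apache-activemq-5.16.4 "$(id -u alfresco)"
```

An empty result would mean that everything under the directory already belongs to that user.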

Target OS

RHEL 8.3

Ansible error

ActiveMQ:

RUNNING HANDLER [../roles/activemq : restart-activemq] ******
fatal: [activemq_1]: FAILED! => {"changed": false, "msg": "Unable to start service activemq: Job for activemq.service failed because the control process exited with error code.\nSee \"systemctl status activemq.service\" and \"journalctl -xe\" for details.\n"}

Repository:

RUNNING HANDLER [../roles/repository : restart-alfresco-content] *****
fatal: [repository_1]: FAILED! => {"changed": false, "msg": "Unable to start service alfresco-content: Job for alfresco-content.service failed because the control process exited with error code.\nSee \"systemctl status alfresco-content.service\" and \"journalctl -xe\" for details.\n"}

Ansible context

ansible --version
(alfresco-ansible) alfresco@XXX:~/git/alfresco-ansible-deployment$ ansible --version
ansible [core 2.12.4]
  config file = /home/alfresco/git/alfresco-ansible-deployment/ansible.cfg
  configured module search path = ['/home/alfresco/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /home/alfresco/git/alfresco-ansible-deployment/alfresco-ansible/lib/python3.8/site-packages/ansible
  ansible collection location = /home/alfresco/.ansible/collections:/usr/share/ansible/collections
  executable location = /home/alfresco/git/alfresco-ansible-deployment/alfresco-ansible/bin/ansible
  python version = 3.8.10 (default, Mar 15 2022, 12:22:08) [GCC 9.4.0]
  jinja version = 3.1.1
  libyaml = True
ansible-config dump --only-changed
(alfresco-ansible) alfresco@XXX:~/git/alfresco-ansible-deployment$ ansible-config dump --only-changed
ANSIBLE_PIPELINING(/home/alfresco/git/alfresco-ansible-deployment/ansible.cfg) = True
ansible-inventory -i your_inventory_file --graph
(alfresco-ansible) alfresco@XXX:~/git/alfresco-ansible-deployment$ ansible-inventory -i inventory_ssh.yml --graph
@all:
  |--@activemq:
  |  |--activemq_1
  |--@adw:
  |  |--adw_1
  |--@database:
  |  |--database_1
  |--@external:
  |  |--@external_activemq:
  |--@external_activemq:
  |--@nginx:
  |  |--nginx_1
  |--@repository:
  |  |--repository_1
  |--@search:
  |  |--search_1
  |--@syncservice:
  |  |--syncservice_1
  |--@transformers:
  |  |--transformers_1
  |--@ungrouped:
@alxgomz
Contributor

alxgomz commented Apr 23, 2022

Hi @Fikili ,

That's an odd issue you're reporting. I know of no reason why a war file belonging to root would break the startup of the catalina process (as long as the war is readable by the alfresco user)... Actually, we test the playbook as part of the CI against ubuntu20 EC2 instances and never witnessed that. To be sure this has nothing to do with non-EC2 ubuntu20, I've spun up a local VM and installed ubuntu 20.04 on it.
I've run the playbook, and everything worked. The war files do indeed belong to root, but that's not causing an issue when it comes to starting up. Please see below the permissions I have after the system is successfully installed:

 ls -l /opt/alfresco/content-services-7.2.0/web-server/webapps/*.war
-rw-rw-r-- 1 alfresco alfresco    275031 Mar 19 09:32 /opt/alfresco/content-services-7.2.0/web-server/webapps/ROOT.war
-rw-rw-r-- 1 alfresco alfresco    649827 Mar 19 09:32 /opt/alfresco/content-services-7.2.0/web-server/webapps/_vti_bin.war
-rw-r--r-- 1 root     root     206523411 Apr 23 13:56 /opt/alfresco/content-services-7.2.0/web-server/webapps/alfresco.war
-rw-r--r-- 1 alfresco alfresco   1207734 Apr 23 13:32 /opt/alfresco/content-services-7.2.0/web-server/webapps/api-explorer.war
-rw-r--r-- 1 root     root      96425960 Apr 23 13:57 /opt/alfresco/content-services-7.2.0/web-server/webapps/share.war
ls -l /opt/apache-activemq-5.16.4/
total 18112
-rw-r--r-- 1 alfresco alfresco    40581 Jan 31 01:02 LICENSE
-rw-r--r-- 1 alfresco alfresco     3334 Jan 31 01:02 NOTICE
-rw-r--r-- 1 alfresco alfresco     2611 Jan 31 01:02 README.txt
-rwxr-xr-x 1 alfresco alfresco 18471406 Jan 31 01:02 activemq-all-5.16.4.jar
drwxr-xr-x 5 root     root         4096 Apr 23 12:39 bin
drwxr-xr-x 2 alfresco alfresco     4096 Apr 23 12:39 docs
drwxr-xr-x 7 alfresco alfresco     4096 Jan 31 01:02 examples
drwxr-xr-x 6 alfresco alfresco     4096 Apr 23 12:39 lib
drwxr-xr-x 6 alfresco alfresco     4096 Apr 23 12:39 webapps
drwxr-xr-x 3 root     root         4096 Apr 23 12:39 webapps-demo
ls -l /opt/apache-tomcat-9.0.59/
total 128
-rw-r----- 1 alfresco alfresco 18980 Feb 21 21:01 BUILDING.txt
-rw-r----- 1 alfresco alfresco  6210 Feb 21 21:01 CONTRIBUTING.md
-rw-r----- 1 alfresco alfresco 57092 Feb 21 21:01 LICENSE
-rw-r----- 1 alfresco alfresco  2333 Feb 21 21:01 NOTICE
-rw-r----- 1 alfresco alfresco  3378 Feb 21 21:01 README.md
-rw-r----- 1 alfresco alfresco  6898 Feb 21 21:01 RELEASE-NOTES
-rw-r----- 1 alfresco alfresco 16507 Feb 21 21:01 RUNNING.txt
drwxr-x--- 2 alfresco alfresco  4096 Apr 23 13:31 bin
drwxr-x--- 2 alfresco alfresco  4096 Apr 23 13:31 lib
ls -l /opt/alfresco/.ansible_alfresco_components.status
-rw-r--r-- 1 root root 864 Apr 23 14:02 /opt/alfresco/.ansible_alfresco_components.status

I'm wondering if that could be linked to the control node running on WSL2...? Or maybe to some previous failures and subsequent runs not happening as they should. We try to give high priority to the idempotency of each role, and to a certain extent to idempotency at the playbook level, but we know some roles (like repository) still need to be improved.
Do you have, by any chance, the permissions of the files before you changed them manually? Do you know what the default umask on the system is?

@Fikili
Contributor Author

Fikili commented Apr 25, 2022

Hi @alxgomz,

Ok, I am going to clean everything according to https://github.com/Alfresco/alfresco-ansible-deployment/blob/master/docs/deployment-guide.md#cleanup. Then I will start a new installation and let you know the umask.
Anyway, the problem during bootstrap was related to permissions. I thought it was because of the root ownership, so I fixed that and then the installation continued properly.

@Fikili
Contributor Author

Fikili commented Apr 25, 2022

The first permission problem occurs for ActiveMQ:

RUNNING HANDLER [../roles/activemq : restart-activemq] ********
fatal: [activemq_1]: FAILED! => {"changed": false, "msg": "Unable to start service activemq: Job for activemq.service failed because the control process exited with error code.\nSee \"systemctl status activemq.service\" and \"journalctl -xe\" for details.\n"}

Error visible in journal:

-- Unit activemq.service has begun starting up.
Apr 25 08:16:32 activemq.sh[193402]: /opt/alfresco/activemq.sh: line 17: /opt/apache-activemq-5.16.4/bin/activemq: Permission denied
Apr 25 08:16:32 systemd[1]: activemq.service: Control process exited, code=exited status=126
Apr 25 08:16:32 systemd[1]: activemq.service: Failed with result 'exit-code'.

Line 17, which is also the only part that could be related to permission issues, is ${ACTIVEMQ_HOME}/bin/activemq $* in /opt/alfresco/activemq.sh.
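
For context on status=126 in the journal: that is the exit status a POSIX shell reports when a command is found but cannot be executed. A minimal sketch that reproduces the status with a throwaway file (`status_of_non_executable` is a hypothetical name, and the temp file only stands in for the ActiveMQ launcher):

```shell
#!/bin/sh
# Hypothetical demo: run a file that exists but has no execute bit
# and print the exit status reported by the shell.
status_of_non_executable() {
    f=$(mktemp)
    chmod 644 "$f"              # readable, but not executable
    rc=0
    "$f" 2>/dev/null || rc=$?   # the shell fails with "Permission denied"
    rm -f "$f"
    echo "$rc"
}

status_of_non_executable   # prints 126
```

The same status results when the file itself is executable but one of its parent directories cannot be searched by the calling user.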

List of files with permissions under ${ACTIVEMQ_HOME}:

# ll ${ACTIVEMQ_HOME}
total 18112
-rwxr-xr-x. 1 alfresco alfresco 18471406 Jan 31 03:02 activemq-all-5.16.4.jar
drwxr-x---. 5 root     root         4096 Apr 25 08:16 bin
drwxr-xr-x. 2 alfresco alfresco     4096 Apr 25 08:16 docs
drwxr-xr-x. 7 alfresco alfresco     4096 Jan 31 03:02 examples
drwxr-xr-x. 6 alfresco alfresco     4096 Apr 25 08:16 lib
-rw-r--r--. 1 alfresco alfresco    40581 Jan 31 03:02 LICENSE
-rw-r--r--. 1 alfresco alfresco     3334 Jan 31 03:02 NOTICE
-rw-r--r--. 1 alfresco alfresco     2611 Jan 31 03:02 README.txt
drwxr-xr-x. 6 alfresco alfresco     4096 Apr 25 08:16 webapps
drwxr-x---. 3 root     root         4096 Apr 25 08:16 webapps-demo

List of files with permissions under ${ACTIVEMQ_HOME}/bin/:

# ll ${ACTIVEMQ_HOME}/bin/
total 156
-rwxr-xr-x. 1 alfresco alfresco 26694 Jan 31 03:02 activemq
-rwxr-xr-x. 1 alfresco alfresco  6190 Jan 31 03:02 activemq-diag
-rw-r--r--. 1 alfresco alfresco 15940 Jan 31 03:02 activemq.jar
-rw-r--r--. 1 alfresco alfresco  5598 Jan 31 03:02 env
drwxr-x---. 2 root     root      4096 Apr 25 08:16 linux-x86-32
drwxr-x---. 2 root     root      4096 Apr 25 08:16 linux-x86-64
drwxr-x---. 2 root     root      4096 Apr 25 08:16 macosx
-rw-r--r--. 1 alfresco alfresco 83820 Jan 31 03:02 wrapper.jar
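
In the listings above, bin is drwxr-x--- and owned by root:root, so a user that is neither root nor in the root group (presumably the alfresco user here) cannot enter it, even though bin/activemq itself is executable. A sketch for spotting such directories, assuming a find that supports -mindepth (GNU and BSD do); `dirs_not_traversable_by_others` is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: print directories below ROOT that lack the
# "other" execute (search) bit, i.e. that users outside the owning
# user and group cannot traverse.
dirs_not_traversable_by_others() {
    find "$1" -mindepth 1 -type d ! -perm -o=x -print
}

# Example call on the target host, run as root so that find can
# descend into every directory:
#   dirs_not_traversable_by_others /opt/apache-activemq-5.16.4
```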

Let me know if you need more info.
BTW, I used $ git pull before this test in order to work on the latest code from master branch.

@Fikili
Contributor Author

Fikili commented Apr 29, 2022

Hi @alxgomz,

Were you able to reproduce the issue? Do you need more info from me?

BTW, next week my colleague will try the deployment on his laptop. I will let you know if he faces the same issue.

@alxgomz
Contributor

alxgomz commented May 2, 2022

Hi @Fikili ,
I could confirm that after deployment the artifacts are indeed owned by root, but that did not cause any issue in my local env. I tried it by deploying to a remote VM installed with ubuntu20.04.
After fixing the other issue you reported (regarding setenv.sh), everything went well regardless of the ownership of these artifacts.
Let us know how things go for your colleague
Regards,

@gionn
Member

gionn commented May 3, 2022

@Fikili could you report the default umask of the root user on the target OS with:

$ umask
022

if it's 027 that's probably the source of the issue.
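
To illustrate why 027 would matter: new directories get mode 777 masked by the umask and new files get 666 masked by it, so 027 yields drwxr-x--- directories that other users cannot enter. A small sketch, assuming GNU coreutils `stat -c` on the target; `modes_under_umask` is a hypothetical helper:

```shell
#!/bin/sh
# Hypothetical demo: print the octal modes that a new directory and a
# new file receive under the given umask. Assumes GNU `stat -c`.
modes_under_umask() {
    (
        umask "$1"                 # only affects this subshell
        tmp=$(mktemp -d)
        mkdir "$tmp/dir"
        : > "$tmp/file"
        echo "$(stat -c '%a' "$tmp/dir") $(stat -c '%a' "$tmp/file")"
        rm -rf "$tmp"
    )
}

modes_under_umask 022   # prints "755 644"
modes_under_umask 027   # prints "750 640"
```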

@Fikili
Contributor Author

Fikili commented May 4, 2022

Hi guys,
I am not able to check that today. It will be checked tomorrow. Anyway, based on the following article, the default umask is 022:
https://docs.microsoft.com/en-us/windows/wsl/file-permissions

@Fikili
Copy link
Contributor Author

Fikili commented May 4, 2022

Hi @gionn,
Expected output can be seen in Ubuntu WSL2:

$ umask
0022

Tomorrow, my colleague will run the Ansible installer from WSL as well. I'll let you know the result.

@Fikili
Contributor Author

Fikili commented May 6, 2022

Hi @gionn and @alxgomz,
FYI, my colleague faces the same issues. The best solution would be for all artifacts to be owned by the alfresco application user defined in "roles/common/vars/main.yml". In our case, we have a different user for DEV, QA and PROD, and I can see that some parts use the hardcoded string alfresco as the username -> I will create a separate issue or pull request for that.

In addition, I can see that e.g. apply_amps.sh contains sudo and even references some non-existing folders. As a best practice, the application user shouldn't have sudo rights.

Last but not least, thank you for your help and I really appreciate that you improve the project actively.
