Skip to content

HDDS-11966. Add recon support#16

Merged
adoroszlai merged 4 commits intoapache:mainfrom
fraochan:add-recon-sts
Dec 18, 2025
Merged

HDDS-11966. Add recon support#16
adoroszlai merged 4 commits intoapache:mainfrom
fraochan:add-recon-sts

Conversation

@fraochan
Copy link
Contributor

@fraochan fraochan commented Dec 15, 2024

What changes were proposed in this pull request?

This PR adds the files required for an optional Recon.

Recon does not require a persistent volume. I hesitated to do a deployment, but the Apache Ozone Git project suggests a statefulset as an example. I don't mind converting it to a deployment if you find it necessary.

What is the link to the Apache JIRA

HDDS-11966

How was this patch tested?

I created a local cluster with kind, then :

  • I deployed an Apache Ozone cluster with Recon:
helm install ozone ./charts/ozone --set recon.enabled=true    
  • I waited for the datanodes to register with Recon and checked that the datanodes were listed in the Recon UI localhost:9888/#/Datanodes with a port-forward:
kubectl port-forward svc/ozone-recon 9888:9888
  • I created a bucket with aws-cli:
aws s3api --endpoint http://localhost:9878 create-bucket --bucket=wordcount
helm upgrade ozone ./charts/ozone --set recon.enabled=false

Copy link

@ptlrs ptlrs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR @fraochan. It mostly looks good. I have a few comments.

Comment on lines +60 to +61
- name: WAITFOR
value: {{ $.Release.Name }}-scm-0.{{ $.Release.Name }}-scm-headless:{{ .Values.scm.service.port }}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should consider if waiting for another pod is strictly necessary.
If we do have to wait then we should do it in an initcontainer.

Also, we should use the SCM service instead of addressing the pod directly. At any point we have to consider the pod -scm-0 being down and should instead rely on other SCM pods (if any in HA mode) by using the SCM service.

Copy link
Contributor Author

@fraochan fraochan Dec 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was inspired by the Apache Ozone Kubernetes examples of the source repository. But indeed, I agree with you.

Recon runs in a loop until it can communicate with the SCM headless service. In my opinion, deleting this part is enough.

Otherwise, I'll do an initContainer (although I'm not sure yet what I'll do as a check to make Recon wait, probably not too hard but I haven't thought about it yet).

{{- $tolerations := or .Values.recon.tolerations .Values.tolerations }}
{{- $securityContext := or .Values.recon.securityContext .Values.securityContext }}
apiVersion: apps/v1
kind: StatefulSet
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should consider making this pod stateless by using a Deployment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should have been more confident, that was my initial suggestion. But in line with the rest of the Helm Chart, I've made it a Statefulset.

So I completely agree with you and have made the changes.

By the way, I'd say the S3 gateway is also stateless and could be a deployment, couldn't it?

value: "true"
{{- if .Values.recon.enabled }}
- name: OZONE-SITE.XML_ozone.recon.address
value: {{ include "ozone.recon.pod" . }}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should point to the Recon service name instead of the pod.
We should be able to refer to the recon pod without specifying the type of the pod. This is where the service comes into the picture.

ozone.recon.pod defined in this file above can be removed if not needed elsewhere.

Copy link
Contributor Author

@fraochan fraochan Dec 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just had to add the rpc port to the service as well, but yes,we might use the service.

@fraochan fraochan changed the title Add recon support HDDS-11966. Add recon support Dec 19, 2024
@fraochan
Copy link
Contributor Author

@ptlrs, thanks for the constructive review, I've modified the code accordingly and I'm ready for another one :)

@adoroszlai adoroszlai requested a review from ptlrs January 31, 2025 16:19
@ivandika3 ivandika3 requested review from devmadhuu and removed request for ptlrs July 7, 2025 01:34
Copy link

@devmadhuu devmadhuu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @fraochan for the patch. Though I am not aware of internal details, but overall changes LGTM +1 for Recon ports and service deployment.

@MadanDhungel
Copy link

hey i would like to know when will the apache ozone officially merge and reflect these values on the branch, i am currently trying to deploy in K8s and i am unable to get svc/recon. Also any pointer will be much helpful.

@adoroszlai
Copy link
Contributor

Thanks @MadanDhungel for pinging, sorry, I totally forgot about this.

@adoroszlai
Copy link
Contributor

@dnskr please take a look

@MadanDhungel
Copy link

In the main banch i did not see any folder with name recon
main-branch

in the changed files there is recon with service and development file but is not reflected to main branch, other changes were also not reflected .
files-changed

is there another process or is the recon depreciated or whats going on

image

do i need to install ozone's recon UI from different ways .
@adoroszlai

@adoroszlai
Copy link
Contributor

@MadanDhungel This is not yet merged. Please checkout the PR:

git fetch https://github.com/apache/ozone-helm-charts refs/pull/16/head && git checkout FETCH_HEAD

Then you can try Recon as being added by this PR. Thank you for testing it.

@Madan-Dhungel
Copy link

hey thanks for the info, i tried and was successful in deploying in K8s cluster, all pods are running fine with services being exposed.

Huge thanks to you @adoroszlai

@adoroszlai adoroszlai merged commit 0463c8c into apache:main Dec 18, 2025
1 check passed
@adoroszlai
Copy link
Contributor

Thanks @fraochan for the patch (sorry for the late response), @devmadhuu, @ptlrs for the review, @Madan-Dhungel for testing it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants