@AlexBlueSteele
Contributor

@AlexBlueSteele AlexBlueSteele commented Oct 8, 2025

Ticket: #2186
Test run output:
File index

{
  "_index": "test",
  "_id": "REDACTED",
  "_score": null,
  "_source": {
    "meta": {
      "raw": {
        "X-TIKA:Parsed-By": "org.apache.tika.parser.DefaultParser",
        "X-TIKA:Parsed-By-Full-Set": "org.apache.tika.parser.DefaultParser",
        "resourceName": "test.xml",
        "Content-Type": "application/xml"
      }
    },
    "file": {
      "extension": "xml",
      "content_type": "application/xml",
      "created": "REDACTED",
      "last_modified": "REDACTED",
      "last_accessed": "REDACTED",
      "indexing_date": "REDACTED",
      "filesize": REDACTED,
      "filename": "test.xml",
      "url": "file://REDACTED//test.xml",
      "indexed_chars": REDACTED
    },
    "path": {
      "root": "REDACTED",
      "virtual": "\\REDACTED\\test.xml",
      "real": "\\\\REDACTED\\test.xml"
    },
    "attributes": {
      "owner": "DOMAIN1\\testuser",
      "permissions": 0,
      "acl": [
        {
          "principal": "DOMAIN1\\testuser",
          "type": "ALLOW",
          "permissions": [
            "APPEND_DATA",
            "DELETE",
            "DELETE_CHILD",
            "EXECUTE",
            "READ_ACL",
            "READ_ATTRIBUTES",
            "READ_DATA",
            "READ_NAMED_ATTRS",
            "SYNCHRONIZE",
            "WRITE_ACL",
            "WRITE_ATTRIBUTES",
            "WRITE_DATA",
            "WRITE_NAMED_ATTRS",
            "WRITE_OWNER"
          ]
        },
        {
          "principal": "DOMAIN1\\Test Admins",
          "type": "ALLOW",
          "permissions": [
            "APPEND_DATA",
            "DELETE",
            "DELETE_CHILD",
            "EXECUTE",
            "READ_ACL",
            "READ_ATTRIBUTES",
            "READ_DATA",
            "READ_NAMED_ATTRS",
            "SYNCHRONIZE",
            "WRITE_ACL",
            "WRITE_ATTRIBUTES",
            "WRITE_DATA",
            "WRITE_NAMED_ATTRS",
            "WRITE_OWNER"
          ]
        }
      ]
    }
  },
  "highlight": {
    "path.virtual": [
      "<b>\\REDACTED\\test.xml</b>"
    ],
    "file.filename": [
      "<b>test.xml</b>"
    ]
  },
  "sort": [
    REDACTED
  ]
}

Folder index

    {
        "path": {
            "root": "REDACTED",
            "virtual": "Group3Folder",
            "real": "W:\\Group3Folder"
        },
        "file": {
            "content_type": "text/directory",
            "created": "REDACTED",
            "last_modified": "REDACTED",
            "last_accessed": "REDACTED",
            "filename": "Group3Folder"
        },
        "attributes": {
            "owner": "DOMAIN\\FileShareManagers",
            "permissions": 0,
            "acl": [
                {
                    "principal": "DOMAIN\\group3",
                    "type": "ALLOW",
                    "permissions": [
                        "APPEND_DATA",
                        "DELETE",
                        "DELETE_CHILD",
                        "EXECUTE",
                        "READ_ACL",
                        "READ_ATTRIBUTES",
                        "READ_DATA",
                        "READ_NAMED_ATTRS",
                        "SYNCHRONIZE",
                        "WRITE_ACL",
                        "WRITE_ATTRIBUTES",
                        "WRITE_DATA",
                        "WRITE_NAMED_ATTRS",
                        "WRITE_OWNER"
                    ],
                    "flags": [
                        "DIRECTORY_INHERIT",
                        "FILE_INHERIT",
                        "INHERIT_ONLY"
                    ]
                },
                {
                    "principal": "BUILTIN\\Administrators",
                    "type": "ALLOW",
                    "permissions": [
                        "APPEND_DATA",
                        "DELETE",
                        "DELETE_CHILD",
                        "EXECUTE",
                        "READ_ACL",
                        "READ_ATTRIBUTES",
                        "READ_DATA",
                        "READ_NAMED_ATTRS",
                        "SYNCHRONIZE",
                        "WRITE_ACL",
                        "WRITE_ATTRIBUTES",
                        "WRITE_DATA",
                        "WRITE_NAMED_ATTRS",
                        "WRITE_OWNER"
                    ],
                    "flags": [
                        "DIRECTORY_INHERIT",
                        "FILE_INHERIT"
                    ]
                },
                {
                    "principal": "DOMAIN\\Domain Users",
                    "type": "ALLOW",
                    "permissions": [
                        "EXECUTE",
                        "READ_ACL",
                        "READ_ATTRIBUTES",
                        "READ_DATA",
                        "READ_NAMED_ATTRS",
                        "SYNCHRONIZE"
                    ]
                },
                {
                    "principal": "DOMAIN\\service accounts",
                    "type": "ALLOW",
                    "permissions": [
                        "APPEND_DATA",
                        "DELETE",
                        "DELETE_CHILD",
                        "EXECUTE",
                        "READ_ACL",
                        "READ_ATTRIBUTES",
                        "READ_DATA",
                        "READ_NAMED_ATTRS",
                        "SYNCHRONIZE",
                        "WRITE_ACL",
                        "WRITE_ATTRIBUTES",
                        "WRITE_DATA",
                        "WRITE_NAMED_ATTRS",
                        "WRITE_OWNER"
                    ],
                    "flags": [
                        "DIRECTORY_INHERIT",
                        "FILE_INHERIT"
                    ]
                },
                {
                    "principal": "DOMAIN\\FileShareManagers",
                    "type": "ALLOW",
                    "permissions": [
                        "APPEND_DATA",
                        "DELETE",
                        "DELETE_CHILD",
                        "EXECUTE",
                        "READ_ACL",
                        "READ_ATTRIBUTES",
                        "READ_DATA",
                        "READ_NAMED_ATTRS",
                        "SYNCHRONIZE",
                        "WRITE_ACL",
                        "WRITE_ATTRIBUTES",
                        "WRITE_DATA",
                        "WRITE_NAMED_ATTRS",
                        "WRITE_OWNER"
                    ]
                },
                {
                    "principal": "\\CREATOR OWNER",
                    "type": "ALLOW",
                    "permissions": [
                        "APPEND_DATA",
                        "DELETE",
                        "DELETE_CHILD",
                        "EXECUTE",
                        "READ_ACL",
                        "READ_ATTRIBUTES",
                        "READ_DATA",
                        "READ_NAMED_ATTRS",
                        "SYNCHRONIZE",
                        "WRITE_ACL",
                        "WRITE_ATTRIBUTES",
                        "WRITE_DATA",
                        "WRITE_NAMED_ATTRS",
                        "WRITE_OWNER"
                    ],
                    "flags": [
                        "DIRECTORY_INHERIT",
                        "FILE_INHERIT",
                        "INHERIT_ONLY"
                    ]
                },
                {
                    "principal": "NT AUTHORITY\\SYSTEM",
                    "type": "ALLOW",
                    "permissions": [
                        "APPEND_DATA",
                        "DELETE",
                        "DELETE_CHILD",
                        "EXECUTE",
                        "READ_ACL",
                        "READ_ATTRIBUTES",
                        "READ_DATA",
                        "READ_NAMED_ATTRS",
                        "SYNCHRONIZE",
                        "WRITE_ACL",
                        "WRITE_ATTRIBUTES",
                        "WRITE_DATA",
                        "WRITE_NAMED_ATTRS",
                        "WRITE_OWNER"
                    ],
                    "flags": [
                        "DIRECTORY_INHERIT",
                        "FILE_INHERIT"
                    ]
                },
                {
                    "principal": "DOMAIN\\FileShareManagers",
                    "type": "ALLOW",
                    "permissions": [
                        "APPEND_DATA",
                        "DELETE",
                        "DELETE_CHILD",
                        "EXECUTE",
                        "READ_ACL",
                        "READ_ATTRIBUTES",
                        "READ_DATA",
                        "READ_NAMED_ATTRS",
                        "SYNCHRONIZE",
                        "WRITE_ACL",
                        "WRITE_ATTRIBUTES",
                        "WRITE_DATA",
                        "WRITE_NAMED_ATTRS",
                        "WRITE_OWNER"
                    ],
                    "flags": [
                        "DIRECTORY_INHERIT",
                        "FILE_INHERIT",
                        "INHERIT_ONLY"
                    ]
                }
            ]
        },
        "_id": "REDACTED"
    },

Note

Adds fs.acl_support to collect and index file/folder ACLs, updates mappings/templates, wiring, docs, and tests.

  • Framework/Core:
    • Introduces FileAcl and FsCrawlerUtil#getFileAcls(...); adds ACL extraction for local files.
    • Extends beans.Attributes with acl and beans.Folder with attributes; propagates attributes/ACLs in FsParserAbstract.
    • FileAbstractModel carries acls; FileAbstractorFile fills ACLs (FTP/SSH return empty lists).
  • Settings:
    • New flag fs.acl_support (+ defaults, parsing, validation warning if attributes_support is false).
  • Elasticsearch:
    • Add component template fscrawler_mapping_attributes with attributes.acl.{principal,type,permissions,flags}.
    • Include attributes component in folders index template; load it at startup.
  • Docs:
    • Document acl_support in admin guides (index.rst, local-fs.rst, rest.rst) with examples.
  • Tests:
    • Add ACL-related tests (FsCrawlerUtilTest, JsonUtilTest, settings loader/parser tests) and update sample configs.
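
The ACL extraction summarized in the list above can be illustrated with the JDK's AclFileAttributeView. This is a minimal sketch, not the PR's actual code: AclSketch, AclRecord, readAcls and toRecord are hypothetical names standing in for FileAcl and FsCrawlerUtil#getFileAcls(...), and the field names simply mirror the indexed JSON shown earlier (principal, type, permissions, flags).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.AclEntry;
import java.nio.file.attribute.AclFileAttributeView;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

public class AclSketch {

    /** Hypothetical stand-in for the PR's FileAcl bean (assumption, not the real class). */
    public static final class AclRecord {
        public final String principal;
        public final String type;
        public final List<String> permissions;
        public final List<String> flags;

        public AclRecord(String principal, String type, List<String> permissions, List<String> flags) {
            this.principal = principal;
            this.type = type;
            this.permissions = permissions;
            this.flags = flags;
        }
    }

    /**
     * Reads the ACL entries of a path. Returns an empty list when the
     * file system does not expose an ACL view (typical on Linux).
     */
    public static List<AclRecord> readAcls(Path path) throws IOException {
        AclFileAttributeView view = Files.getFileAttributeView(path, AclFileAttributeView.class);
        if (view == null) {
            return Collections.emptyList();
        }
        List<AclRecord> result = new ArrayList<>();
        for (AclEntry entry : view.getAcl()) {
            result.add(toRecord(entry));
        }
        return result;
    }

    /** Converts one JDK AclEntry into plain strings, sorted so the output is stable. */
    public static AclRecord toRecord(AclEntry entry) {
        List<String> permissions = entry.permissions().stream()
                .map(Enum::name)
                .sorted()
                .collect(Collectors.toList());
        List<String> flags = entry.flags().stream()
                .map(Enum::name)
                .sorted()
                .collect(Collectors.toList());
        return new AclRecord(entry.principal().getName(), entry.type().name(), permissions, flags);
    }
}
```

Returning an empty list when no ACL view is available keeps callers free of null checks, in the same spirit as the empty lists the summary describes for FTP/SSH.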

Written by Cursor Bugbot for commit b61843c. This will update automatically on new commits.

@dadoonet
Owner

@AlexBlueSteele Do you want to continue on this PR? If so, could you make sure that all the tests are passing locally?

And then I can start reviewing the code ;)

@AlexBlueSteele
Contributor Author

> @AlexBlueSteele Do you want to continue on this PR? If so, could you make sure that all the tests are passing locally?
>
> And then I can start reviewing the code ;)

I will work on this. Thanks!

@AlexBlueSteele AlexBlueSteele marked this pull request as draft November 6, 2025 16:03
@sonarqubecloud

sonarqubecloud bot commented Nov 6, 2025

@dadoonet
Owner

dadoonet commented Nov 6, 2025

I'm wondering if we need the acl_support setting? I believe that the other one (attributes_support) is enough. WDYT?

@AlexBlueSteele
Contributor Author

> I'm wondering if we need the acl_support setting? I believe that the other one (attributes_support) is enough. WDYT?

I think it's nice to have because some people might not care about ACLs when crawling on Windows. It adds a slight overhead in storage and performance.
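
For illustration, the opt-in behaviour argued for here could be gated roughly like this. It is only a sketch under assumptions: AclSettingsSketch, FsFlags, shouldCollectAcls and validationWarning are hypothetical names, and the rule that ACLs are skipped when attributes_support is off is an assumption based on the validation warning mentioned in the summary above.

```java
import java.util.Optional;

public class AclSettingsSketch {

    /** Hypothetical holder for the two flags; not FSCrawler's real settings class. */
    public static final class FsFlags {
        final boolean attributesSupport;
        final boolean aclSupport;

        public FsFlags(boolean attributesSupport, boolean aclSupport) {
            this.attributesSupport = attributesSupport;
            this.aclSupport = aclSupport;
        }
    }

    /** ACLs are collected only when both flags are on (assumed rule). */
    public static boolean shouldCollectAcls(FsFlags flags) {
        return flags.attributesSupport && flags.aclSupport;
    }

    /** Returns a warning for the inconsistent combination, or empty if the flags agree. */
    public static Optional<String> validationWarning(FsFlags flags) {
        if (flags.aclSupport && !flags.attributesSupport) {
            return Optional.of("fs.acl_support is enabled but fs.attributes_support is disabled");
        }
        return Optional.empty();
    }
}
```

With a separate flag, users who only want owner and permission attributes avoid the extra ACL lookups and the larger documents.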

@AlexBlueSteele
Contributor Author

@dadoonet I updated the demo data above to reflect the ACL collection for the folder index as well. The final part that I think needs to be changed is the verbosity of the debug statements. I don't know how verbose you want it. Thoughts?

@dadoonet dadoonet marked this pull request as ready for review November 18, 2025 15:21
Owner

@dadoonet dadoonet left a comment

That's a great start and thank you so much for working on this.
Bonus point for writing some documentation ;)

Could you fix the issues I mentioned before?

Also, the next step would be, IMO, to add an integration test for this if possible. Maybe in FsCrawlerTestAttributesIT?

Would that be possible for you?

Also, did you run the code locally, and does it do what you expect?

I'm still wondering whether we really need a separate acl_support setting, or whether we should simply collect all the available metadata on files and folders whenever attributes are requested...

@AlexBlueSteele
Contributor Author

> That's a great start and thank you so much for working on this. Bonus point for writing some documentation ;)
>
> Could you fix the issues I mentioned before?
>
> Also, the next step would be, IMO, to add an integration test for this if possible. Maybe in FsCrawlerTestAttributesIT?
>
> Would that be possible for you?
>
> Also, did you run the code locally, and does it do what you expect?
>
> I'm still wondering whether we really need a separate acl_support setting, or whether we should simply collect all the available metadata on files and folders whenever attributes are requested...

I will work on this as soon as I get the chance! Thanks for spending the time to CR!!
