Skip to content

Conversation

@gratural
Copy link
Contributor

@gratural gratural commented Apr 1, 2025

The patch fixes several issues in Linux systems:

  1. Eliminates CMake warnings about using the deprecated FetchContent_Populate
  2. Fixes opening local world maps (but their names must be in lowercase)
  3. Add few improvements for asset_browser:
  • makes the filter actually functional
  • alphabetical sorting
  • significantly speeds up filtering

Note: QStringList performs filtering and sorting significantly faster than std::unordered_set.
On my system filtering with std::unordered_set took around 7 seconds. Same operation with QStringList completes in 1 second.

@gratural gratural closed this Apr 1, 2025
@gratural gratural deleted the build-linux branch April 1, 2025 07:36
@gratural gratural restored the build-linux branch April 1, 2025 08:11
@gratural gratural reopened this Apr 1, 2025
sorted_listfile.sort();

for (std::string const& file : gListfile)
for (const QString& file : sorted_listfile)
Copy link
Contributor

@bloerwald bloerwald Apr 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes zero sense. A copy and sorting (which wasn't there before) is faster than just iterating?

I assume it is

for (asset_tree_node& child : children)
{
if (child_name == child.name)
{
return child;
, which is now more likely to find the node faster than needing to iterate the entire list. The real™ fix there is to not use a vector but something with a faster lookup.

struct asset_tree_node
{
  asset_tree_node(std::string name) : name(name) {}

  asset_tree_node& add_child(std::string const& child_name)
  {
    return children.emplace(child_name, child_name).first->second;
  }

  std::string name;
  std::map<std::string, asset_tree_node> children;
};

has obviously worse space requirement (storing every string twice), but should be way faster, without sorting and copying first.

Can you please compare with your current solution (and maybe original, and post times)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes zero sense. A copy and sorting (which wasn't there before) is faster than just iterating?
Exactly... double copying, sorting, but it’s still noticeably faster than it was before.

I compared: map is insignificantly faster (6-7 seconds) than vector (7-8 seconds).
QList still leads with its 1-2 seconds.
I thought maybe it’s related to my system: the version of std, gcc, or Qt5 or Qt6...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that still doesn’t really make sense. Is your noggit build type release? If noggit isn’t the difference would make sense: the installed qt probably is a release build.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, my Noggit build is the default one, and I only use these keys (usually they're all enabled):
cmake -DNOGGIT_WITH_SCRIPTING=OFF -DNOGGIT_ALL_WARNINGS=OFF -DNOGGIT_BINDLESS_TEXTURES=OFF ../noggit3

On Linux systems, having an up-to-date QT5 (or even QT6) is absolutely essential.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, so I did some benchmarking here to understand it. The real gain comes from std::regex vs QRegularExpression.

  • Matching filenames with std::regex vs QRegularExpression is about 10 seconds slower.
  • The split by / on regex vs QString::split() is 900ms slower.
  • vector+search vs map is about 500ms slower.
  • Using a map, I was unable to measure a benefit from sorting, it made the whole thing 50ms slower, which is to be expected.

This combined, I arrived at the conclusions:

  • use map to get sorting for free
  • use QString and QRegularExpression to gain copy-on-write and way better regex
  • use QString::split()

In my setup, the whole create_tree() function is now 410ms, of which more time is spent adding to the UI widget tree than preparing.

I pushed those changes to default, so you might want to rebase and remove the commits touching this file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent, the asset search really works perfectly now.
My old commit is no longer needed.
But there’s one more feature to add...


std::stringstream ssfilename;
ssfilename << "World\\Maps\\" << lMapName << "\\" << lMapName << ".wdt";
ssfilename << "world/maps/" << lMapName << "/" << lMapName << ".wdt";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While these changes seem to fix the issue, the actual issue isn't fixed (and new cases might be added in the future).

Make it getDiskPath(noggit::mpq::normalized_filename(filename)) there and you should be able to avoid the changes in path assembling.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're absolutely right, there will definitely be new requests.
I think this isn't enough, and we need to preserve the case sensitivity of file paths.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Theorically, fs::path could preserve local file case sensitivity, but I really dislike using it – it's extremely lacks functionality.
Maybe we should migrate to Boost or Qt?

I get the impression that it was left unfinished.

  • It lacks methods for dynamically finding separators.
  • When working with strings, you constantly have to convert them, trim separators, and add them back in loops. And for output - you have to use a special method.
  • And my personal favorite - the way it gets initialized:
  std::stringstream ssfilename;
  std::filesystem::path wdt_path("World") / "Maps" / lMapName / std::string(lMapName + ".wdt");

  ssfilename << noggit::mpq::resolved_filepath(wdt_path);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have one small question: what would your position be on removing normalization here:

/*
* basic constructor to save the file to project path
*/
MPQFile::MPQFile(std::string const& filename)
  : eof(true)
  , pointer(0)
  , External(false)
 // , _filename(noggit::mpq::normalized_filename(filename))
  , _filename(filename)
  , _disk_path (getDiskPath (_filename))

I tested opening and saving ADT files, adding tilesets and models to them.
Interestingly, the client prefers taking files from the 'World' directory rather than 'world'.
I always thought it prioritized newer files...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is that case sensitivity is probably something you don't want to preserve, since data is extremely shit in that regard, having World, WORLD and world etc. I see that having force-lowercase on linux is mediocre, but honestly, it is better than actually having to preserve all the weird case depending on which version of which folder you're referencing. There even are cases where the same file uses different case combinations.

If you do want to somehow fix that, imo it should be wrapped within the stuff actually checking existence/opening from disk, and it should there try to do a case insensitive search. Yes, that would require listing all directories and trying to match, keeping track of the various variations until finding the file.

I'd very much prefer not having to change any filename everywhere in noggit since that also ignores data provided filenames, not solving the issue after all.

I'm totally pro using std::filesystem::path, but in the cases where it comes from data you'll still have the normalisation required.

When loading files, paths are always normalized, on a layer below as well. Blizzard does the same. It should be entirely irrelevant what casing they use within MPQs. There is no difference there.

Copy link
Contributor Author

@gratural gratural Apr 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's exactly what I thought, and in commit 508eefc I added the uni_path() function.
The uni_path() function handles all WDT/ADT file access with this logic:

  1. First attempts World/Maps path (case-sensitive check)
  2. If exists → uses canonical path exactly as stored
  3. If not found → falls back to normalized lowercase path

As a result:

  • MPQ archives: preserved canonical filenames
  • Local files: will be saved using the same path

if (!MPQFile::exists(_filename))
{
LogError << "file \"World\\Maps\\" << basename << "\\" << basename << ".wdl\" does not exist." << std::endl;
LogError << "file " << filename.str() << " does not exist." << std::endl;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keep even when reverting manual normalize

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants