In case anyone is clustering large datasets:
in my experiments (40M corpus and NofClusters=1000), turning on compiler optimization with "-O3" gave a speed-up of roughly 3x.
I changed the following lines in my Makefile:
wcluster: $(files)
	g++ -Wall -g -O3 -o wcluster $(files)

%.o: %.cc
	g++ -Wall -g -O3 -o $@ -c $<
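
If you want to compare optimization levels without editing the Makefile each time, one option is to pull the flags into a variable. This is only a sketch, not part of the original Makefile: it assumes GNU make, that $(files) is already defined earlier in the Makefile, and that recipe lines are indented with a tab. The CXX and CXXFLAGS variables are my addition.

```make
# Sketch only: CXX and CXXFLAGS are introduced here for illustration.
# $(files) is assumed to be defined earlier in the Makefile.
CXX ?= g++
CXXFLAGS ?= -Wall -g -O3

wcluster: $(files)
	$(CXX) $(CXXFLAGS) -o wcluster $(files)

%.o: %.cc
	$(CXX) $(CXXFLAGS) -o $@ -c $<
```

With this in place, something like `make clean && make CXXFLAGS="-Wall -g -O0"` should give an unoptimized build to time against the -O3 one (assuming your Makefile has a clean target).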