LIReC - Library of Integer Relations and Constants#23
itaybthl wants to merge 29 commits into RamanujanMachine:master
don't need setup.cfg if we just use setup.py. also, some of our packages currently don't support python 3.11, so the latest version we can use is 3.10.8
restructured and trimmed down, and currently untested...
pretty much ready for tomorrow, have plenty of relations on the boinc pcfs. scout role is for strangers intending to find new relations, pioneer role is for people closer to us (e.g. BOINC) that (also?) want to add PCFs. (named constants will still be added manually by superuser for now)
this is the name we decided on!
noamzaks left a comment:
Here are a few notes you may find useful :)
LIReC/create_db.py (outdated)
    print(f'Using {precision} digits of precision')
    Constants.set_precision(precision)
    db = db_access.LIReC_DB()
    for x in Constants.__dict__.keys():
Maybe call x something like "function_name"? Is it always a function?
Well, it is supposed to be limited to functions computing constants, so const would probably be the better call.
LIReC/create_db.py (outdated)
    continue
    print(f'Adding named constant {x}')
    named_const = models.NamedConstant()
    const_func = eval(f'Constants.{x}')
I think avoiding eval is best, maybe something like Constants.__dict__[x] can work
Good call, no idea why I didn't think of that since I already call __dict__. There's another part of the code that also uses eval, which I can modify to take from __dict__ instead in a similar way, so I'll do both.
Using __dict__ is also probably not the best afaik, but it would require changing how the Constants file is laid out...
Is there a better way? I don't think converting lib/calculator.py to a massive switch case or a massive dictionary would be better... There's also the other code I mentioned, and getting rid of calls to eval or __dict__ there would probably involve just circumventing them with a literal dictionary of string to class, which kinda sounds dumb...
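One possible middle ground between eval, raw __dict__ access, and a hand-written dictionary is to let `inspect.getmembers` enumerate the callables on the class. The sketch below is only an illustration: the `Constants` class here is a stand-in with made-up methods, the `NOT_CONSTANTS` exclusion set is hypothetical, and it assumes the real constant functions are exposed as static methods on the class.

```python
import inspect


class Constants:
    """Stand-in for the real Constants class; only its shape matters here."""

    @staticmethod
    def set_precision(prec):
        pass  # configuration helper, not a constant

    @staticmethod
    def pi():
        return 3.14159

    @staticmethod
    def e():
        return 2.71828


# Hypothetical list of names that live on the class but compute no constant.
NOT_CONSTANTS = {'set_precision'}


def iter_constant_functions(cls):
    """Yield (name, function) for each public constant function on cls."""
    for name, func in inspect.getmembers(cls, callable):
        if name.startswith('_') or name in NOT_CONSTANTS:
            continue
        yield name, func


found = dict(iter_constant_functions(Constants))
for const, const_func in found.items():
    print(f'Adding named constant {const}: {const_func()}')
```

Nothing is evaluated from a string, and dunder attributes are filtered out in one place. A decorator-based registry would be another option, at the cost of touching every function in the Constants file.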
    result = get_webpage(url)
    preview(result_filename, result, schema)

    '''
Document this? Maybe later we can think of creating a documentation site for the code (probably something auto-generated like doxygen)
Going with the monorepo idea, and considering I'll need to upgrade this to suit more general use cases that LIReC will need, I think for now the best call would be to document it there. (also yes, currently this is detached from the rest of the code, but is intended as a feature soon enough)
LIReC/lib/db/start_db.sh (outdated)
    @@ -0,0 +1,3 @@
    docker rm ramanujan_db
    docker pull postgres
    docker run --name ramanujan_db -v /home/amir/projects/current_projects/ramanujanpriv/DB/data:/var/lib/postgresql/data -e POSTGRES_PASSWORD=123456 -p 5432:5432 -d postgres
Should this /home/amir/projects/... path be hard-coded?
Could it cause issues for other devices?
Is it supposed to be run by other devices?
Honestly, the only script I ever used in that folder (or suspect I'll ever need) is create_db.sql. I haven't looked into the other scripts, and I'd give a good chance they're redundant, though all I did was copypaste from the other repo. If they are indeed redundant I'll delete them.
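If the start script does turn out to be needed, the hard-coded path and password could come from the environment instead. Below is a rough sketch in Python that builds the same `docker run` command; the variable name `LIREC_DB_DATA` and the function `build_docker_run` are invented for illustration and are not part of the PR.

```python
import os
import shlex


def build_docker_run(env=None):
    """Build the docker run command from environment settings."""
    env = os.environ if env is None else env
    # Fall back to ./data so the script works on any machine (assumed default).
    data_dir = env.get('LIREC_DB_DATA', os.path.join(os.getcwd(), 'data'))
    # No default password on purpose: a missing value raises KeyError.
    password = env['POSTGRES_PASSWORD']
    return [
        'docker', 'run', '--name', 'ramanujan_db',
        '-v', f'{data_dir}:/var/lib/postgresql/data',
        '-e', f'POSTGRES_PASSWORD={password}',
        '-p', '5432:5432', '-d', 'postgres',
    ]


cmd = build_docker_run({'LIREC_DB_DATA': '/srv/lirec/data',
                        'POSTGRES_PASSWORD': 'example'})
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` without going through a shell.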
LIReC/lib/manual_values/B_2.txt (outdated)
    @@ -0,0 +1,10 @@
    # A065421 (b-file synthesized from sequence entry)
Should all these manual_values files be in git version control?
I could probably figure out an automatic way to download these from OEIS, then discard once we're done...
Not sure I completely understand what these do, but in any case - I think generally they should be in source control iff you want to commit every time they change (and if they are some sort of cache of the LIReC then I think you don't)
Well, as I mentioned these are taken directly from the online encyclopedia of integer sequences, and are mostly intended to be part of creating a new database. So, instead of having this massive folder, whenever necessary I can download constants from OEIS into the new DB pretty much directly (up to parsing their format, but I got that covered already).
I have worked in the past with a build system called Bazel which allows you to download artifacts during build time and use them in your code. Maybe we should consider uploading these files to an artifactory and integrating with a tool that allows us to download and use them on demand?
Well, the only time we'd ever need these files is when creating a new DB, which I'd imagine would happen pretty rarely. Since create_db.py already does most of the work, I don't think there's much need to get fancy with it by using some special build tools or some such... Nice idea though!
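For reference, the parsing step mentioned above can be quite small. The following is a sketch, not the code in create_db.py: it assumes the usual b-file layout of one `n a(n)` pair per line with `#` comment lines, and it assumes that for a decimal-expansion sequence the first index equals the number of digits before the decimal point. The function names and the sample text (the first digits of pi, laid out like a b-file) are made up for the example.

```python
def parse_bfile(text):
    """Return (first_index, digits) from b-file text, skipping '#' comments."""
    first_index = None
    digits = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        n, value = line.split()[:2]
        if first_index is None:
            first_index = int(n)
        digits.append(value)
    if first_index is None:
        raise ValueError('no data lines found')
    return first_index, digits


def digits_to_decimal(first_index, digits):
    """Join digits into a decimal string, placing the point by first_index."""
    if first_index >= 1:
        whole, frac = digits[:first_index], digits[first_index:]
        return ''.join(whole) + '.' + ''.join(frac)
    # An index of 0 or below means the value starts after the decimal point.
    return '0.' + '0' * (-first_index) + ''.join(digits)


# Illustrative sample only, not the contents of B_2.txt.
SAMPLE = """# sample b-file
1 3
2 1
3 4
4 1
5 5
6 9
"""

value = digits_to_decimal(*parse_bfile(SAMPLE))
print(value)
```

The returned string can then be handed to whatever arbitrary-precision type the database layer expects.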
    'ortools>=7.4.7247',
    'pandas>=1.0.1',
    'protobuf>=3.11.3',
    'psutil>=5.5.1',
    'psycopg2>=2.8.6',
    'pybloom-live',
    'PyLaTeX>=1.3.1',
    'pyparsing>=2.4.6',
    'pytest>=6.2.4',
    'python-dateutil>=2.8.1',
    'pytz>=2019.3',
    'scipy>=1.6.0',
    'simplejson>=3.16.0',
    'six>=1.14.0',
    'sqlacodegen>=2.3.0',
    'sympy>=1.5.1',
    'ortools>=7.4.7247',
    'pybloom-live'
    'xlrd>=2.0.1',
Do all people who use all code in Ramanujan need all of these dependencies?
Perhaps we can separate the dependencies into subprojects, so if someone only needs one part they don't need to have everything;
In terms of easily installing everything and running something in Docker or some environment, this is probably the best, at least for now;
Leaving this here to think about solutions similar to yarn workspaces or cargo subpackages etc.
All I did was add the dependencies from the other repo. I haven't investigated which dependencies I actually need (and chances are I need most), but if we really want to minimize dependencies I think the best call would be to ask everyone to investigate which dependencies their projects actually need. (otherwise we can have just one guy combing through the code, but that's probably going to involve unnecessary headache...)
If subproject-specific dependencies are possible and easy enough to work with, I think that could be a good idea.
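One low-effort way to get subproject-specific dependencies with plain setup.py is the `extras_require` argument of setuptools. The sketch below only shows the shape of it: the group names (`pslq`, `dev`) and which package lands in which group are guesses for illustration, since nobody has yet checked what each subproject actually imports.

```python
# Requirements every install needs (hypothetical split).
CORE = ['sympy>=1.5.1', 'psycopg2>=2.8.6']

# Optional groups, installed only on request (hypothetical names and contents).
EXTRAS = {
    'pslq': ['scipy>=1.6.0'],
    'dev': ['pytest>=6.2.4', 'sqlacodegen>=2.3.0'],
}
# Convenience group that pulls in every optional requirement.
EXTRAS['all'] = sorted({req for group in EXTRAS.values() for req in group})

SETUP_KWARGS = {'install_requires': CORE, 'extras_require': EXTRAS}

# In setup.py this would be passed along as: setup(..., **SETUP_KWARGS)
print(SETUP_KWARGS['extras_require']['all'])
```

Someone who only needs one part would then install with something like `pip install .[pslq]`, while Docker images could keep using `.[all]`.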
had to downgrade python to 3.8.10, for some reason pip3.10 pisses itself on EC2 and I wasted a full day of troubleshooting trying to get it to unpiss itself
kinda nice now that manual values are downloaded directly from OEIS instead of cached locally
might as well have them both under the same roof...
keeping it on source control just for a bit so we can test with it later if we want, there are some weird differences with the new "mergesort" scheme that might not be problematic after all
public spectator role, no more relation_audit table (replaced by postgres logging which is better), and added derived_constant table
now adding to pcf_canonical_constant through LIReC_DB also adds original_a and original_b
ec2 seems working now, gonna let it run for a while to see how well it works
second most noteworthy thing here is print_relations, my utility for printing relations in a somewhat more readable way. will probably be removed later once the GUI has taken shape
turns out sympy is better at handling arbitrarily large integers than mpmath. also, __run_old__ is now officially deprecated, because applying similar fixes to __run_old__ causes it to behave identically to run. this just leaves one small issue with the new run: convergence calculation can be somewhat scuffed since fr_list isn't saved, but that's not really a problem honestly
just because
eyal suggested gmpy2, and i further optimized the code. just running to depth=8000 is significantly faster, can only imagine how fast larger numbers will be thanks to gmpy2. now can write refine and verify jobs!
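To make the kind of computation being discussed concrete, here is a minimal sketch of evaluating a continued fraction a(0) + b(1)/(a(1) + b(2)/(a(2) + ...)) to a given depth with the standard integer recurrence for convergents. It is not the optimized code from this commit: the function name `pcf_convergent` is made up, and gmpy2 is treated as optional, falling back to Python's built-in integers when it is not installed.

```python
try:
    from gmpy2 import mpz  # optional speed-up for very large integers
except ImportError:
    mpz = int  # built-in ints are also arbitrary precision, just slower


def pcf_convergent(a, b, depth):
    """Return (p, q) with p/q the depth-th convergent of the continued fraction.

    Uses p_n = a(n)*p_{n-1} + b(n)*p_{n-2}, and the same recurrence for q_n,
    starting from p_{-1}=1, p_0=a(0), q_{-1}=0, q_0=1.
    """
    p_prev, p = mpz(1), mpz(a(0))
    q_prev, q = mpz(0), mpz(1)
    for n in range(1, depth + 1):
        a_n, b_n = a(n), b(n)
        p_prev, p = p, a_n * p + b_n * p_prev
        q_prev, q = q, a_n * q + b_n * q_prev
    return p, q


# a(n) = b(n) = 1 gives the golden ratio; convergents are Fibonacci ratios.
p, q = pcf_convergent(lambda n: 1, lambda n: 1, 10)
print(p, q)
```

Keeping everything in exact integers until the final division avoids precision loss at large depths, which is where a faster big-integer backend pays off.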
still need to test exactly when it happens, but apparently "can't attach instance" exceptions can happen when trying to commit relations. this should at least allow the code to keep working while I figure out a more permanent solution.
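The stopgap described here can be pictured as committing relations one at a time and rolling back whenever one fails, so a single bad instance does not stop the job. The sketch below is an assumption about the general pattern, not the code in this commit: `commit_individually` and `FakeSession` are invented names, and the only requirement on the session is that it offers `add`, `commit` and `rollback` the way a SQLAlchemy session does.

```python
def commit_individually(session, relations, recoverable=(Exception,)):
    """Try to commit each relation on its own; roll back and skip failures."""
    saved, skipped = [], []
    for rel in relations:
        try:
            session.add(rel)
            session.commit()
            saved.append(rel)
        except recoverable:
            # Leave the session usable for the next relation.
            session.rollback()
            skipped.append(rel)
    return saved, skipped


class FakeSession:
    """Tiny stand-in so the sketch runs without a database."""

    def __init__(self, reject):
        self.reject = reject
        self.pending = []
        self.stored = []

    def add(self, item):
        if item in self.reject:
            raise RuntimeError(f"can't attach instance {item!r}")
        self.pending.append(item)

    def commit(self):
        self.stored.extend(self.pending)
        self.pending.clear()

    def rollback(self):
        self.pending.clear()


session = FakeSession(reject={'r2'})
saved, skipped = commit_individually(
    session, ['r1', 'r2', 'r3'], recoverable=(RuntimeError,))
print(saved, skipped)
```

Returning the skipped relations makes it possible to log or retry them later while the permanent fix is worked out.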
poly_pslq has run for over 24 hours by now and has found more relations, will see how it holds up
verification job next! then moving to the separate LIReC project
can now run the same job under different configs
gonna leave this here for now, need to start summary paper
Contains code for automatically searching for integer relations on the DB. More features are planned, but for now this should be functional.