- Introduced ExamSoft converter for processing ExamSoft files.
- Registered new converters for different file types.
- Updated the convert method to accept an import source.
- Refactored question handling to support multiple choice questions.
- Added pandoc-ruby gem for document conversion.
mpetrowi
left a comment
This is a good start! I like the reorganization of the Gem, and how it can easily be expanded later with more converters.
I haven't looked at the conversion code in detail yet, mostly because I don't have samples of ExamSoft exports. Send those to me separately so I can think about the conversion decisions.
What's next:
- Fix some small comments I made, nothing major.
- Write RSpec tests. I think functional tests would be appropriate: add some fixture files in different formats, have the specs run the conversion, and write RSpec expectations to check the results. When you add the fixtures, make sure there isn't customer data in them.
This PR is to address this Atomic Assessments issue
…t usage instructions, and improve converter registration and specs
…documents
- Created a new spec file for testing the loading of questions in the AtomicAssessmentsImport module, ensuring that multiple choice questions are correctly instantiated from various input formats.
- Added a spec file for utility functions, specifically testing the boolean parsing functionality with various inputs and defaults.
- Introduced sample documents in different formats (DOCX, HTML, RTF) to be used as fixtures for testing the import functionality.
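For readers without the gem at hand, here is a minimal sketch of the kind of boolean parsing the utility spec exercises. The module name `UtilsSketch`, the accepted strings, and the `default:` keyword are assumptions for illustration; the gem's actual `parse_boolean` may accept different inputs or use a different signature.

```ruby
# Hypothetical sketch; not the gem's actual Utils.parse_boolean.
module UtilsSketch
  TRUE_VALUES = %w[true yes y 1].freeze
  FALSE_VALUES = %w[false no n 0].freeze

  # Returns true or false for recognized strings (case-insensitive,
  # surrounding whitespace ignored); otherwise returns the default.
  def self.parse_boolean(value, default: false)
    normalized = value.to_s.strip.downcase
    return true if TRUE_VALUES.include?(normalized)
    return false if FALSE_VALUES.include?(normalized)

    default
  end
end
```

A spec along the lines described would then check recognized values, unrecognized values, and `nil` against both defaults.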
… and update feedback fields in ExamSoft converter
Documents the heuristic chunker + field detector approach for handling unknown ExamSoft export formats with best-effort parsing. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Introduce the ExamSoft::Chunker module with a Strategy base class and MetadataMarkerStrategy that splits HTML documents on Folder:/Type: markers. This is the first step in refactoring the ExamSoft converter from rigid regex parsing to a flexible chunker+extractor pipeline. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add a chunking strategy that splits HTML documents on numbered question patterns (e.g., "1)" or "1.") while ignoring lettered answer options. Header content before the first question is captured separately. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
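The idea in this commit can be sketched on plain-text lines. This is a simplified illustration only: the PR's strategy operates on HTML documents, and the module name, pattern, and return shape below are assumptions rather than the actual implementation.

```ruby
# Hypothetical sketch over plain-text lines; the PR's strategy works on HTML.
module NumberedQuestionSketch
  # A question starts with digits followed by "." or ")".
  # Lettered options such as "A)" or "b." do not match because they lack digits.
  QUESTION_START = /\A\s*\d+[.)]\s+/

  def self.chunk(lines)
    header = []
    chunks = []
    lines.each do |line|
      if line.match?(QUESTION_START)
        chunks << [line]          # start a new question chunk
      elsif chunks.empty?
        header << line            # content before the first question
      else
        chunks.last << line       # options and other lines join the current chunk
      end
    end
    { header: header, chunks: chunks }
  end
end

result = NumberedQuestionSketch.chunk([
  "Exam: Sample",
  "1) What is 2 + 2?",
  "A) 3",
  "B) 4",
  "2. Name a primary color.",
  "a. Red"
])
```

Here `result[:header]` holds the title line and `result[:chunks]` holds two chunks, each with its question line followed by its options.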
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add module-level .chunk(doc) method to ExamSoft::Chunker that tries each strategy in priority order (MetadataMarker > NumberedQuestion > HeadingSplit > HorizontalRuleSplit) and returns the first valid result. Falls back to treating the entire document as a single chunk with a warning when no strategy matches. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
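A rough sketch of the priority-ordered fallback described in this commit, using plain text instead of HTML. The strategy patterns, the "first pattern that matches is valid" rule, and the result hash are all assumptions; HeadingSplit is left out because it only makes sense on parsed HTML.

```ruby
# Hypothetical sketch; the PR's Chunker uses HTML-aware strategy classes.
module ChunkerSketch
  # Priority-ordered [name, split_pattern] pairs (simplified stand-ins).
  STRATEGIES = [
    [:metadata_marker, /^(?=Folder:)/],
    [:numbered_question, /^(?=\d+[.)]\s)/],
    [:horizontal_rule, /^-{3,}\s*$/]
  ].freeze

  def self.chunk(text)
    STRATEGIES.each do |name, pattern|
      next unless text.match?(pattern)

      # First strategy whose pattern appears in the text wins.
      chunks = text.split(pattern).map(&:strip).reject(&:empty?)
      return { strategy: name, chunks: chunks, warnings: [] }
    end

    # No strategy matched: keep the whole document as one chunk and warn.
    {
      strategy: :single_chunk,
      chunks: [text.strip],
      warnings: ["No chunking strategy matched; treating document as one chunk"]
    }
  end
end
```

Returning the warning alongside the chunks, rather than raising, lets the caller decide whether a single-chunk result is acceptable.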
Add three extractor detector classes for parsing ExamSoft question chunks:
- QuestionStemDetector: extracts question text, strips metadata prefixes and explanations
- OptionsDetector: finds lettered answer options with asterisk/bold correct markers
- CorrectAnswerDetector: determines correct answers from option markers or Answer: labels
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
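As an illustration of the option and correct-answer detection described here, the following sketch handles only the asterisk marker on plain-text lines. The module name, pattern, and hash keys are assumptions; bold markers and Answer: labels, which the PR's detectors also cover, are not handled.

```ruby
# Hypothetical sketch; covers only asterisk-marked options in plain text.
module OptionsSketch
  # Optional leading "*", a single letter, "." or ")", then the option text.
  OPTION_LINE = /\A\s*(\*)?\s*([A-Za-z])[.)]\s+(.+?)\s*\z/

  # Returns one hash per line that looks like a lettered option.
  def self.detect(lines)
    lines.map do |line|
      match = OPTION_LINE.match(line)
      next unless match

      { letter: match[2].upcase, text: match[3], correct: !match[1].nil? }
    end.compact
  end

  # Letters of the options flagged as correct.
  def self.correct_letters(options)
    options.select { |option| option[:correct] }.map { |option| option[:letter] }
  end
end

options = OptionsSketch.detect(["What is 2 + 2?", "A) 3", "*B) 4", "C) 5"])
answers = OptionsSketch.correct_letters(options)
```

With this input, the stem line is ignored, three options are found, and `answers` contains only "B".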
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add Essay (longanswer) and ShortAnswer (shorttext) question type classes that inherit from Question. Update Question.load to dispatch essay, longanswer, short_answer, shorttext, and true_false question types. Also tighten the /ma/ regex to /^ma$/ to avoid false matches. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
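To show why the regex was tightened, here is a small sketch of type dispatch. The type strings for essay and short answer come from the commit message; the returned symbols, the `nil` fallback, and the "matching" example are assumptions and do not reflect the gem's real class mapping.

```ruby
# Hypothetical sketch of type dispatch; return values are illustrative only.
module QuestionLoadSketch
  def self.resolve(type)
    case type.to_s.strip.downcase
    when /^ma$/ then :multiple_answer            # anchored: only the exact code "ma"
    when "essay", "longanswer" then :essay
    when "short_answer", "shorttext" then :short_answer
    when "true_false" then :true_false
    end
  end
end

loose_hit = "matching".match?(/ma/)     # unanchored pattern matches any type containing "ma"
tight_hit = "matching".match?(/^ma$/)   # anchored pattern does not
multiline_hit = "ma\nessay".match?(/^ma$/)
```

One design note: in Ruby, `^` and `$` anchor at line boundaries, so `multiline_hit` is true for a multi-line string. If type values could ever contain newlines, `\A` and `\z` would be the stricter choice.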
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace the monolithic ExamSoft converter with a pipeline that:
1. Normalizes input to HTML via Pandoc
2. Chunks the document into per-question segments
3. Extracts fields (stem, options, answers, metadata, feedback) per chunk
4. Builds Learnosity items/questions from extracted data
5. Collects warnings in :errors array instead of raising
Key fixes:
- Clean embedded newlines from stems and feedback text
- Set template to nil (not question type) to avoid ui_style errors
- Update specs to expect warnings instead of raised errors
- Fix HTML spec option-removal regex to use [^<] instead of [^\}]
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
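The "collect warnings instead of raising" behavior can be sketched as follows. This skips the Pandoc and chunking steps and assumes each chunk has already been reduced to a hash; the module name, input shape, and the structure of the `:errors` entries are assumptions, not the converter's actual data format.

```ruby
# Hypothetical sketch of steps 3-5; input and output shapes are assumed.
module PipelineSketch
  def self.convert(chunks)
    result = { questions: [], errors: [] }

    chunks.each_with_index do |chunk, index|
      # Collapse embedded newlines so the stem reads as one line.
      stem = chunk[:stem].to_s.gsub(/\s*\n\s*/, " ").strip

      if stem.empty?
        # Record a warning and move on instead of raising.
        result[:errors] << { index: index, message: "Missing question stem; chunk skipped" }
        next
      end

      result[:questions] << { stem: stem, options: Array(chunk[:options]) }
    end

    result
  end
end

output = PipelineSketch.convert([
  { stem: "What is\n 2 + 2?", options: %w[3 4] },
  { stem: "  ", options: [] }
])
```

A spec written against this shape would assert on `output[:errors]` rather than using `raise_error` expectations.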
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove commented-out code, consolidate redundant require_relative statements in exam_soft.rb, and apply safe rubocop auto-corrections (modifier if/unless, %r regexp literals, safe navigation, etc). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
mpetrowi
left a comment
This is getting really close! Apologies for not getting to this earlier, I've had it on my todo list for far too long.
A couple comments, and I'd like to figure out the tag conversion before deploying this.
… <br> children into separate <p> elements; added single-chunk warning
- Fixed to collect multi-line feedback after tilde, stopping at option lines
- Added F, FIB, E, and SA type codes
- Added FITB correct answer detection from option texts
…onverter
- Improved category extraction to handle line-wrapped categories
- Updated title extraction to avoid truncation at parenthetical numbers
- Enhanced FillInTheBlank question type to build stimulus with response placeholders
- Added tests for new functionality in metadata and question stem detectors
- Clarified normalize_to_html method to show how it handles both file paths and file-like objects.
- Updated categories_to_tags method for better key-value extraction.
- Adjusted question handling for Multiple Answer types in the conversion process.
- Modified metadata extraction to support new category parsing logic.
- Updated Fill in the Blank and Multiple Choice question templates for consistency.
- Fixed integration tests to reflect changes in question data structure.
- Added ClozeDropdown class for handling dropdown options in fill-in-the-blank questions.
- Updated question_type for Essay from "longanswer" to "longtext".
- Improved validation structure for FillInTheBlank and Ordering questions.
- Added tests for new ClozeDropdown functionality and updated existing tests for consistency.
mpetrowi
left a comment
The conversion code looks much more capable. I'm not surprised AI is really good at writing that. This is really close.
A few changes:
- Don't import anything that you're labeling as status draft. Doing that would just accumulate junk in the item bank. If we do need to support a full conversion that could later be fixed by hand, I think we would make passage features with the ExamSoft source in them, so content could be hand-authored inside AA. But that would be a future refinement. The goal is to not have to do that at all.
- The pandoc-ruby gem needs to move to the gemspec
- Update the version in the gemspec
Thanks!
The version is here: lib/atomic_assessments_import/version.rb
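For reference, a gemspec that declares pandoc-ruby as a runtime dependency and reads its version from a constant might look roughly like the sketch below. The module name, placeholder version, summary, and authors are assumptions for illustration; only the gem name and the pandoc-ruby dependency come from this thread, and the code is wrapped in a method so it can be evaluated on its own.

```ruby
require "rubygems"

# Hypothetical stand-in for the constant defined in version.rb.
module AtomicAssessmentsImportSketch
  VERSION = "0.0.0" # placeholder value, not the gem's real version
end

# Builds a specification the way a .gemspec file would.
def build_sketch_gemspec
  Gem::Specification.new do |spec|
    spec.name    = "atomic_assessments_import"
    spec.version = AtomicAssessmentsImportSketch::VERSION
    spec.summary = "Placeholder summary"
    spec.authors = ["Placeholder"]
    spec.files   = []

    # Declared here so the dependency is installed along with the gem.
    spec.add_dependency "pandoc-ruby"
  end
end
```

Declaring the dependency in the gemspec means applications that install the gem get pandoc-ruby without having to list it themselves.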
…andoc-ruby dependency
Note: this is what I consider to be a rough draft - I'm looking for suggestions on next steps. I know I need to work on clearing up a few TODO items, as well as doing more testing and documentation updates.
Copilot Summary of Changes:
This pull request introduces major improvements to the atomic_assessments_import library by refactoring its question classes, centralizing utility functions, and adding support for importing ExamSoft files in multiple formats. The changes enhance extensibility and maintainability, and allow for new import sources beyond CSV, such as RTF, DOCX, and HTML from ExamSoft.
Support for ExamSoft imports:
- Added the ExamSoft::Converter class to handle ExamSoft file formats (RTF, DOCX, HTML, XHTML) using Pandoc for conversion to HTML, and registered these converters in the main import module. This enables importing ExamSoft assessments alongside CSV.
Refactoring and code organization:
- Moved the Question and MultipleChoice classes from the CSV-specific namespace to a shared Questions namespace, making them reusable for multiple import sources.
- Added a Utils module for shared utility functions (e.g., parse_boolean) so it can be used by both CSV and ExamSoft converters.
Extensibility improvements: