Format: AVI, CSV, TXT, ZIP
Publisher: IEEE DataPort
Publication Date of the Electronic Edition: 10/27/2025
DOI: 10.21227/7v72-0a66
$15 $3 Discount Coupon
Delivery time: Instant
Description
ChemVQA-2K: A Visual Question Answering Dataset for Molecular Understanding

Overview
ChemVQA-2K is a novel Visual Question Answering (VQA) dataset designed to bridge chemistry and multimodal AI. It contains approximately 2,000 high-resolution molecular images (512×512) generated from valid SMILES strings, accompanied by 10 structured Q&A pairs per molecule, resulting in ~20,000 image-question-answer triplets.
Each image represents a 2D chemical structure rendered using RDKit, while each question tests the model's ability to reason over molecular features such as formula, atom counts, bonds, functional groups, and polarity.

Dataset Structure
Component | Description
ChemVQA_2K_images.zip | 2,000 molecule renderings (mol_0.png, mol_1.png, …)
ChemVQA_2K_full.csv | Complete dataset with columns: id, image_name, question, answer

Each record follows:
    { "id": "mol_123", "image_name": "mol_123.png", "question": "What is the molecular formula of this molecule?", "answer": "C6H6O2" }

Example Questions
Each molecule has multiple Q&A pairs, e.g.:
Question | Example Answer
What is the molecular formula of this molecule? | C₂H₅OH
What is the molecular weight? | 46.07 g/mol
How many total atoms are present? | 9
Which functional groups are present? | Alcohol
Is the molecule polar or non-polar? | Polar

Data Generation Process
- Molecules generated by concatenating random organic fragments and validated using RDKit.
- Each molecule's image created with Draw.MolToFile() at 512×512 px resolution.
- Functional groups detected via SMARTS pattern matching.
- Q&A pairs auto-generated from chemical descriptors (MolWt, CalcMolFormula, substructure matches).

Intended Use
ChemVQA-2K is ideal for:
- Fine-tuning Vision-Language Models (VLMs) for scientific visual reasoning.
- Developing chemistry-aware question answering systems.
- Training vision encoders on molecular visual patterns.
- Exploring RL-based visual understanding of chemical structures.
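The example answers above (formula, molecular weight, total atom count) are related to one another, so a formula answer can be used to sanity-check the other two. The following is a minimal sketch of such a check, not the dataset's own generation code (which, per the description, relies on RDKit descriptors). The function names and the abbreviated atomic-weight table are assumptions made for illustration, and only flat formulas without parentheses or charges are handled.

```python
import re
from collections import Counter

# Abbreviated table of standard atomic weights in g/mol (an assumption for
# this sketch; extend it to cover the elements present in the dataset).
ATOMIC_WEIGHTS = {
    "H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "F": 18.998,
    "P": 30.974, "S": 32.06, "Cl": 35.45, "Br": 79.904,
}

# Map Unicode subscript digits (as in "C₂H₅OH") to plain ASCII digits.
_SUBSCRIPTS = str.maketrans("₀₁₂₃₄₅₆₇₈₉", "0123456789")
_VALID = re.compile(r"(?:[A-Z][a-z]?\d*)+")
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_formula(formula: str) -> Counter:
    """Count atoms per element in a flat formula such as 'C6H6O2' or 'C2H5OH'."""
    text = formula.translate(_SUBSCRIPTS).strip()
    if not _VALID.fullmatch(text):
        raise ValueError(f"unsupported formula: {formula!r}")
    counts = Counter()
    for element, digits in _TOKEN.findall(text):
        # A missing count means one atom; repeated elements are summed.
        counts[element] += int(digits) if digits else 1
    return counts


def total_atoms(formula: str) -> int:
    """Total number of atoms, hydrogens included."""
    return sum(parse_formula(formula).values())


def molecular_weight(formula: str) -> float:
    """Molecular weight in g/mol; raises KeyError for elements not in the table."""
    return sum(ATOMIC_WEIGHTS[el] * n for el, n in parse_formula(formula).items())


if __name__ == "__main__":
    for f in ("C₂H₅OH", "C6H6O2"):
        print(f, total_atoms(f), round(molecular_weight(f), 2))
```

For the ethanol example, this yields 9 atoms and a weight that rounds to 46.07 g/mol, matching the sample answers. Note that the sample table writes the formula in condensed form (C₂H₅OH), whereas the sample record uses a flat form (C6H6O2); the parser sums repeated elements so that both styles are accepted.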
Dataset Statistics
Property | Value
Images | 1924
Image resolution | 512×512 px
Q&A pairs | 19240
Functional groups detected | 16
File size (approx.) | ~25 MB (images + CSVs)

Potential Research Directions
- Multimodal Chemistry Understanding: connecting visual structure with symbolic reasoning.
- Scientific Vision-Language Pretraining: use as domain-specific VQA benchmark.
- Explainable Chemistry AI: models that describe functional features and molecular properties.
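The statistics above (1924 images, 19240 Q&A pairs) are consistent with the stated 10 Q&A pairs per molecule, since 1924 × 10 = 19240. A minimal sketch of how one might verify this after download is given below. It assumes the CSV has exactly the four documented columns (id, image_name, question, answer); the function names are hypothetical, and the in-memory sample rows merely stand in for the real file.

```python
import csv
import io
from collections import defaultdict

# Column names as documented in the dataset description.
EXPECTED_COLUMNS = ["id", "image_name", "question", "answer"]
PAIRS_PER_MOLECULE = 10  # stated in the dataset overview


def load_qa_pairs(csv_file):
    """Group (question, answer) pairs by image name from an open CSV file object."""
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []
    missing = [col for col in EXPECTED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    by_image = defaultdict(list)
    for row in reader:
        by_image[row["image_name"]].append((row["question"], row["answer"]))
    return dict(by_image)


def summarize(by_image):
    """Return (image count, Q&A pair count, images without exactly 10 pairs)."""
    n_images = len(by_image)
    n_pairs = sum(len(pairs) for pairs in by_image.values())
    irregular = sorted(
        name for name, pairs in by_image.items() if len(pairs) != PAIRS_PER_MOLECULE
    )
    return n_images, n_pairs, irregular


# Tiny in-memory sample standing in for the real CSV (illustrative rows only).
SAMPLE = io.StringIO(
    "id,image_name,question,answer\n"
    "mol_0,mol_0.png,What is the molecular weight?,46.07 g/mol\n"
    "mol_0,mol_0.png,How many total atoms are present?,9\n"
)
print(summarize(load_qa_pairs(SAMPLE)))
```

To run the same check on the downloaded data, pass a file object opened on ChemVQA_2K_full.csv (with newline="" as the csv module recommends) instead of the sample. If the file matches the published statistics, the summary should report 1924 images, 19240 pairs, and no irregular images.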