mForm V2 form JSON — shape analysis
Companion to 01-kickoff-brief.md §10 (the import pipeline). This doc is for the dev who will write A15 (probe script) and A16 (import pipeline). Source files live at ~/Dhwani/swasti-mform-migration/raw/mform-forms/ (extracted from swasti_forms_json.zip).
TL;DR — the three things that matter most
- Self-linking is getDynamicOption with formId == own publishedFormId. Forms 1003, 1005, 1011 each contain a getDynamicOption entry pointing at themselves. The dataOrdersMapping inside that entry says which fields are copied forward from the previous follow-up to the new one (i.e., how the response chain is glued together). The closedItself array on form 1011 contains the terminal statuses that stop self-linking; this is the explicit list we need to preserve when we collapse N self-linked responses into one Frappe record + N audit-log rows.
- There is no separate “form name” field. The human-readable title ("3.2 Scheme Followup", etc.) is buried at language[0].title. Question text is at language[i].question[].title. Everywhere else in the form, questions are referenced by their numeric order, not by _id and not by shortKey. Order is a string. A migration that loses the order-to-Doctype-fieldname mapping is unrecoverable.
- Languages: only en was exported. All 13 forms have language length 1 with lng="en". The Kannada/Telugu copy lives elsewhere (probably a system-level translations table not in this dump). Confirm with the mForm team before assuming we have the full picture for §P7 of the kickoff brief.
1. Inventory
13 files, all flat at the root of the zip, named <publishedFormId>.json. Two formIds — internal (formId, monotonic 1–14, gap at 13) and published (publishedFormId, the file name). All forms are isMaster=false even though some are clearly masters (Scheme Master, Document Master, Donor Master) — isMaster is unreliable; trust the title.
| File | formId | pubId | Title | Qs | Sections (input_type=10) |
|---|---|---|---|---|---|
| 1000.json | 1 | 1000 | 2. Member Profiling | 56 | 1 |
| 1001.json | 2 | 1001 | 1.1 Scheme Master | 12 | 0 |
| 1002.json | 3 | 1002 | 3.1 Scheme Application | 43 | 5 |
| 1003.json | 4 | 1003 | 3.2 Scheme Followup ★self-link | 37 | 9 |
| 1004.json | 5 | 1004 | 4.1 Document Application | 39 | 5 |
| 1005.json | 6 | 1005 | 4.2 Document Application Follow-up ★self-link | 39 | 9 |
| 1006.json | 7 | 1006 | Profile Form | 2 | 0 |
| 1007.json | 8 | 1007 | Filter Form | 3 | 0 |
| 1008.json | 9 | 1008 | 1.2 Document Master | 2 | 0 |
| 1009.json | 10 | 1009 | Donor Master | 6 | 0 |
| 1010.json | 11 | 1010 | 5.1 Health Screening | 90 | 11 |
| 1011.json | 12 | 1011 | 5.2 Health Screening Follow-up ★self-link | 97 | 10 |
| 1012.json | 14 | 1012 | up-test | 57 | 1 |
Bold = main programs from the kickoff. ★ = forms that self-link (response-sprawl pattern → must collapse to audit log on import).
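The inventory above is the kind of output the A15 probe script could regenerate on demand. A minimal sketch, assuming each file parses as one form object with the top-level shape described in §2; the function names are illustrative, and whether the Qs column counts section headers is not confirmed, so this simply counts every entry in question[].

```python
import json
from pathlib import Path

SECTION_TYPE = "10"  # input_type of a section header (see §3)


def summarise_form(form: dict) -> dict:
    """Return one inventory row for a parsed form JSON."""
    lang = form["language"][0]  # only "en" is present in this dump
    questions = lang["question"]
    own_id = form["publishedFormId"]
    return {
        "formId": form["formId"],
        "publishedFormId": own_id,
        "title": lang["title"],
        # Counts every entry in question[], section headers included.
        "questions": len(questions),
        "sections": sum(
            1 for q in questions if str(q.get("input_type")) == SECTION_TYPE
        ),
        # Self-link rule: a getDynamicOption entry pointing at the form itself.
        "self_link": any(
            e.get("formId") == own_id for e in form.get("getDynamicOption", [])
        ),
    }


def inventory(raw_dir: str) -> list[dict]:
    """Summarise every *.json file found directly under raw_dir."""
    rows = []
    for path in sorted(Path(raw_dir).expanduser().glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            rows.append(summarise_form(json.load(fh)))
    return rows
```

Calling inventory() on the raw/mform-forms directory should yield one row per file; any row whose title or counts disagree with the table is worth a manual look before the pipeline is written.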
Missing from this dump
- No system geography master (state/district/block/GP/village data) — geography options are pulled from a system-level master not included.
- No translations for Kannada/Telugu.
- No system master for surveyors/users.
- No Livelihood program forms (the 4th, new program) — expected, since it isn’t on V2 yet.
- up-test (1012) is a developer test form; exclude from production import.
2. Top-level form schema
{
"_id": "66e912180388f75534092900", // Mongo ObjectID
"formId": 1, // internal, monotonic per project
"publishedFormId": 1000, // public ID == file name
"publish": true, "hold": false, "isLocked": false,
"isMaster": false, // unreliable, see §1
"isCustomLabel": false,
"repairImportedQuestion": true,
"formType": null,
"projectOrder": 0, "reportedDateOrder": 0, "surveyorOrder": 0,
"stateOrder": 1, "districtOrder": 2, "blockOrder": 3,
"gramPanchayatOrder": 34, "villageOrder": 35, "hamletOrder": 0,
// ↑ orders of the GEOGRAPHY questions in this form (0 = not present).
"keyInfoOrders": ["53","57"], // orders shown on list/card view
"searchOrders": ["53","57"], // orders searchable in list view
"duplicateCheckQuestions": [], // orders that form a unique-key tuple
"maskingConfig": [], // PII masking rules (empty in dump)
"masterConfig": [], // legacy? empty in every form
"createDynamicOption": [...], // "I publish options to other forms"
"getDynamicOption": [...], // "I pull options from other forms"
"language": [
{
"lng": "en",
"title": "2. Member Profiling",
"buttons": { ... }, // localized button labels
"question": [ // 🔑 the actual form
{ ...question shape... }
]
}
]
}
Question shape
{
"_id": "66e9122487cda931cb1809a4",
"order": "1", // 🔑 the stable identifier
"shortKey": "state", // optional, often "orderNN" or human slug
"label": "2", // display number ("2.")
"viewSequence": "2", // visual order (may differ from "order")
"title": "State", // localized question text
"hint": "",
"input_type": "3", // 🔑 see §3
"answer_option": [ // for select-types only
{ "_id": "1", "name": "Aadhaar", "did": [], "shortKey": "" }
],
"validation": [ { "_id":"...", "error_msg":"", "condition": null } ],
"restrictions": [ { "orders": [...], ... } ],
"child": [ { "type":"3", "value":"^([1])$", "order":"49" } ], // skip-logic — see §4
"parent": [ { "order":"19", "type":"22", "value":"^([1])$" } ], // visibility — see §4
"isToBeEncrypted": false, // PII flag
"editable": false, // can the value be edited after first save
"weightage": [], // scoring (unused in this dump)
"information": "", "informationType": "text",
"information_urls": [], "convertedInformationUrls": [], "resource_urls": []
}
The order field is the source of truth for everything. parent, child, child[].order, getDynamicOption.dataOrdersMapping.fromOrder/toOrder, keyInfoOrders, searchOrders, stateOrder, etc. all reference questions by numeric order. Lose the order mapping and the import is dead.
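Because every reference is keyed on order, a cheap early check is to index questions by order and list any referenced order that no question defines. A minimal sketch, assuming the field names shown in the schema above; it covers the top-level order lists, the six geography pointers, and parent/child links, and leaves dataOrdersMapping out because fromOrder belongs to the source form. A reported order is a prompt to investigate rather than proof of corruption.

```python
GEOGRAPHY_KEYS = (
    "stateOrder", "districtOrder", "blockOrder",
    "gramPanchayatOrder", "villageOrder", "hamletOrder",
)
ORDER_LIST_KEYS = ("keyInfoOrders", "searchOrders", "duplicateCheckQuestions")


def index_by_order(form: dict) -> dict:
    """Map order (always coerced to str) to its question dict."""
    index = {}
    for question in form["language"][0]["question"]:
        order = str(question["order"])
        if order in index:
            raise ValueError(f"duplicate order {order!r}")
        index[order] = question
    return index


def dangling_orders(form: dict) -> set:
    """Orders referenced by the form that no question defines."""
    index = index_by_order(form)
    referenced = set()
    for key in ORDER_LIST_KEYS:
        referenced.update(str(o) for o in (form.get(key) or []))
    for key in GEOGRAPHY_KEYS:
        value = str(form.get(key, 0))
        if value != "0":  # 0 means "not present in this form"
            referenced.add(value)
    for question in index.values():
        links = (question.get("parent") or []) + (question.get("child") or [])
        for link in links:
            referenced.add(str(link["order"]))
    return referenced - set(index)
```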
3. input_type codes
mForm uses small numeric strings as type codes. Mapping deduced from sample questions across all 13 forms:
| Code | Meaning (deduced) | Frappe target | Notes |
|---|---|---|---|
| 1 | Text | Data | “Enter Name” |
| 2 | Number | Int / Float | “Annual Income” |
| 3 | Hierarchical select / Geography | Link | “State”/“District”: empty answer_option, options resolved at runtime via system geography master. Has parent for cascading. |
| 4 | Multi-select (checkboxes) | Table MultiSelect or comma-string | “Which documents do you have?” with 12 inline answer_option. |
| 5 | Single-select (radio) | Select or Link | “Gender”: Male/Female/Trans, inline options. |
| 7 | Image upload | Attach Image | “Upload Image for ID 1” |
| 10 | Section header | (no field; Section Break) | label is empty, title is the section name. This is how forms get sectioned. |
| 11 | File / document attach | Attach | “Please attach the Disability Certificate” |
| 13 | Unconfirmed | TBD | Present in distribution count but not in sampled forms; investigate 1002/1004/1010 before pipeline. |
| 14 | Date | Date | “Enter Date of Birth” |
| 22 | Consent / yes-no with copy | Select (2 options) | “Consent Form”: Agree/Disagree, with rich information body. |
| 27 | Text (typeahead / freeform name) | Data | “Individual Facilitator”: no answer_option, behaves like text. |
| 29, 30 | Unconfirmed | TBD | Each appears in only 2 forms (the distribution counts forms, not questions); likely advanced types (signature? rating? location?). Probe before pipeline. |
Counts across all 13 forms: type 1 ×12, 3 ×11, 4 ×10, 5 ×8, 2 ×8, 14 ×8, 10 ×8, 7 ×5, 27 ×5, 11 ×4, 13 ×3, 30 ×2, 29 ×2, 22 ×2.
These totals count the forms in which each type appears (distinct types per form), not the number of questions of that type. Either way, codes 13/29/30 are rare and need a manual sample before mapping; the import pipeline must fail loudly if it sees an unmapped type.
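One way to get the fail-loudly behaviour is a lookup table with no default. A minimal sketch, assuming the deduced mappings in the table above; the choice of Int for code 2 and Table MultiSelect for code 4 is a placeholder until those decisions are made, and the exception name is illustrative.

```python
# Deduced mapping from the table above; codes 13, 29 and 30 are
# deliberately absent until they have been sampled and confirmed.
INPUT_TYPE_TO_FIELDTYPE = {
    "1": "Data",
    "2": "Int",                # placeholder: may need Float after sniffing data
    "3": "Link",
    "4": "Table MultiSelect",  # placeholder: comma-string is the alternative
    "5": "Select",
    "7": "Attach Image",
    "10": "Section Break",
    "11": "Attach",
    "14": "Date",
    "22": "Select",
    "27": "Data",
}


class UnmappedInputType(ValueError):
    """Raised instead of silently defaulting an unknown type to Data."""


def frappe_fieldtype(question: dict) -> str:
    code = str(question["input_type"])
    try:
        return INPUT_TYPE_TO_FIELDTYPE[code]
    except KeyError:
        raise UnmappedInputType(
            f"order {question.get('order')!r}: input_type {code!r} has no mapping"
        ) from None
```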
4. Logic: skip, validation, prefill
Skip / visibility — parent and child
Every question has both parent (who controls my visibility) and child (whom I control). The encoding is symmetric:
// Question 1 (order="1"): "State"
"parent": [ { "order": "19", "type": "22", "value": "^([1])$" } ]
// ↑ "show me when question of order 19 (a type-22 consent question) has value '1' (Agree)"
// Question 19 (order="19"): "Consent Form"
"child": [
{ "type": "3", "value": "^(1)$", "order": "1" }, // controls State
{ "type": "3", "value": "^(1)$", "order": "2" }, // controls District
{ "type": "1", "value": "^(1)$", "order": "4" }, // controls Name
...
]
Important: value is a regex (anchored with ^...$), not an equality. The pipeline must preserve regex semantics — most are simple ^([1])$ but some are ^(1|2|3)$, etc. Don’t convert blindly to value == "1".
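A minimal sketch of evaluating these conditions with a regex engine rather than string equality. Two assumptions are not confirmed by the dump: that mForm's patterns behave the same under Python's re module, and that a question with several parent entries is visible only when all of them match.

```python
import re


def condition_holds(condition: dict, answers: dict) -> bool:
    """True when the controlling question's answer matches the regex.

    `answers` maps order (str) to the stored answer value.
    """
    answer = answers.get(str(condition["order"]))
    if answer is None:
        return False
    # fullmatch keeps the anchored semantics even if ^...$ were missing.
    return re.fullmatch(condition["value"], str(answer)) is not None


def is_visible(question: dict, answers: dict) -> bool:
    """Assumption: multiple parent conditions combine with AND."""
    parents = question.get("parent") or []
    return all(condition_holds(c, answers) for c in parents)
```

With the State question above, is_visible returns True only when the answer stored at order 19 is exactly "1"; an answer of "10" or "11" does not match ^([1])$.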
Validation
"validation": [
{ "_id": "55.1", "error_msg": "", "value": "", "condition": null },
{ "_id": "3.2", "error_msg": "", "condition": null }
]
In this dump: every validation[] entry has condition: null and empty error_msg. Validation rules are not encoded inline in the export we have. Either (a) the validation engine derives rules from input_type + restrictions alone, or (b) validation lives in a separate config not exported. Confirm with mForm team before relying on import-time validation.
Restrictions
restrictions is non-empty on some questions and contains orders[] — likely the orders that this question’s value cannot duplicate (e.g., “this name field can’t equal the value already entered in question 12”). Sample shape only — the few non-empty examples in this dump have { "orders": [...] }. Treat as unconfirmed — sample more before mapping.
Prefill — dataOrdersMapping
The cleanest piece of logic in the schema. Inside getDynamicOption[].dataOrdersMapping:
{
"fromOrder": "21", // pull the value from this order in the SOURCE form
"toOrder": "12" // and prefill it into this order in the CURRENT form
}
Example (form 1003 Scheme Followup pulling from form 1002 Scheme Application):
fromOrder=21 → toOrder=12 (status field carries forward from the application to the followup).
This is exactly the mechanism that maps to Frappe’s Fetch From field property — but it’s keyed on numeric orders, so we need a (formId, order) → fieldname registry generated alongside the Doctypes.
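A minimal sketch of such a registry and of deriving Fetch From strings from it. Everything about naming here is hypothetical: fieldname_for is an invented slug rule, link_fieldnames (source form ID to the Link field on the target Doctype) is supplied by the caller, and the registry is keyed on publishedFormId because that is the value getDynamicOption[].formId carries in this dump. Self-link entries are skipped, since those are handled by the collapse in §6.

```python
import re


def fieldname_for(question: dict) -> str:
    """Hypothetical naming rule: slug of shortKey or title, plus the order."""
    base = question.get("shortKey") or question.get("title") or "field"
    slug = re.sub(r"[^a-z0-9]+", "_", base.lower()).strip("_") or "field"
    return f"{slug}_{question['order']}"


def build_registry(forms: list[dict]) -> dict[tuple[int, str], str]:
    """(publishedFormId, order) -> Frappe fieldname."""
    registry = {}
    for form in forms:
        for q in form["language"][0]["question"]:
            key = (form["publishedFormId"], str(q["order"]))
            registry[key] = fieldname_for(q)
    return registry


def fetch_from_rules(form: dict, registry: dict, link_fieldnames: dict) -> dict:
    """Target fieldname -> 'link_field.source_field' per dataOrdersMapping."""
    rules = {}
    target_id = form["publishedFormId"]
    for entry in form.get("getDynamicOption", []):
        source_id = entry["formId"]
        if source_id == target_id:
            continue  # self-link: collapsed into an audit log instead
        link = link_fieldnames[source_id]  # KeyError = missing Link field
        for mapping in entry.get("dataOrdersMapping", []):
            source = registry[(source_id, str(mapping["fromOrder"]))]
            target = registry[(target_id, str(mapping["toOrder"]))]
            rules[target] = f"{link}.{source}"
    return rules
```

A missing registry key or Link field raises KeyError on purpose, so an unmapped order stops the run instead of producing a field with no prefill.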
5. Cross-form references — getDynamicOption & createDynamicOption
The whole reference graph for the 13 forms:
| Form | Pulls options from | Filter | Notes |
|---|---|---|---|
| 1000 Member Profile | 1009 Donor Master | GEOGRAPHY | Donor list scoped to member’s geography. |
| 1002 Scheme Application | 1000 Member, 1001 Scheme Master | USER / NONE | Member picker = surveyor’s members; Scheme master = global. |
| 1003 Scheme Followup | 1002 Scheme App, 1003 (self) | USER / USER | ★ self-link — see §6. |
| 1004 Document Application | 1000 Member, 1008 Doc Master | USER / NONE | Same shape as Scheme. |
| 1005 Doc App Follow-up | 1004 Doc App, 1005 (self) | USER / USER | ★ self-link |
| 1010 Health Screening | 1000 Member | USER | No global health master; HS is self-contained. |
| 1011 HS Follow-up | 1010 HS, 1011 (self) | USER / USER | ★ self-link. Has closedItself terminal-status list. |
| 1012 up-test | 1009 Donor Master | GEOGRAPHY | Test form. |
filterBy values seen: USER (scope to the current surveyor’s responses), GEOGRAPHY (scope to surveyor’s assigned geographies), NONE (global master). Confirms the kickoff finding that scoping is per-form (Scheme/Document = USER, Health = USER but should be GEOGRAPHY post-V3).
createDynamicOption is the inverse: it declares “responses from this form become options elsewhere” with parentOrder (which order to expose), optionIdentifier (what field is the option label), and conditions[] (only expose when those conditions match — e.g. “expose schemes 1..12 to form 3 only”).
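The reference table above also implies a load order: a form that pulls options from another form presumably needs that source imported first. A minimal sketch using the standard-library graphlib module; treating "pulls from" as "must be imported after" is an assumption about the pipeline, not something the export states.

```python
from graphlib import TopologicalSorter


def import_order(forms: list[dict]) -> list[int]:
    """Published form IDs ordered so that every source precedes its consumers."""
    graph = {}
    for form in forms:
        own = form["publishedFormId"]
        sources = {e["formId"] for e in form.get("getDynamicOption", [])}
        sources.discard(own)  # a self-link would otherwise look like a cycle
        graph[own] = sources
    # Each key maps to its predecessors, so static_order yields sources first.
    return list(TopologicalSorter(graph).static_order())
```

TopologicalSorter raises CycleError if two different forms pull from each other, which would itself be worth knowing before the import runs.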
6. ★ The self-linking pattern (kickoff D3 — replace with audit log)
Detection rule
form.getDynamicOption[*].formId == form.publishedFormId ⇒ this form self-links.
Confirmed in 1003, 1005, 1011. Each has two getDynamicOption entries: one for the parent (e.g., 1003 → 1002 Scheme App) and one for self (1003 → 1003).
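The detection rule is a short filter over getDynamicOption. A minimal sketch that separates the self entry from the parent entry; the function name is illustrative.

```python
def split_dynamic_options(form: dict) -> tuple[list, list]:
    """Return (self_link_entries, parent_entries) for one form."""
    own = form["publishedFormId"]
    entries = form.get("getDynamicOption", [])
    self_links = [e for e in entries if e.get("formId") == own]
    parents = [e for e in entries if e.get("formId") != own]
    return self_links, parents


def is_self_linking(form: dict) -> bool:
    self_links, _ = split_dynamic_options(form)
    return bool(self_links)
```

Run against this dump, is_self_linking would be expected to return True for exactly 1003, 1005 and 1011; any other result means the detection rule or the export has changed.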
Self-link entry for 1003 Scheme Followup
{
"formId": 1003, // ← itself
"filterBy": ["USER"],
"orderToDisplayIn": "11", // a "previous follow-up" picker at order 11
"isReusuable": false, // can't re-pick the same prior follow-up
"isPrimary": false,
"dataOrdersMapping": [ // copy these fields forward
{ "fromOrder": "16", "toOrder": "25" },
{ "fromOrder": "18", "toOrder": "26" },
{ "fromOrder": "20", "toOrder": "27" }
],
"closedItself": [] // 1003 has no terminal-status list
}
So a fresh 1003 response = pick the previous follow-up at order 11 → orders 16/18/20 from that previous follow-up auto-fill into the new response’s orders 25/26/27.
Terminal statuses for 1011 HS Follow-up
"closedItself": [
{ "or": [
{ "order": "22", "value": { "eq": "2" } },
{ "order": "22", "value": { "eq": "4" } },
{ "order": "22", "value": { "eq": "5" } },
{ "order": "22", "value": { "eq": "12" } },
{ "order": "22", "value": { "eq": "13" } },
{ "order": "22", "value": { "eq": "14" } }
]}
]
“Stop allowing further follow-ups when the previous follow-up’s order-22 (status field) equals 2, 4, 5, 12, 13, or 14.”
These are the terminal status codes the kickoff brief mentions abstractly. Capture them now — when collapsing self-linked responses into a Frappe audit log, the import pipeline should mark the parent record as is_closed=true exactly when the latest log entry’s status is in this set.
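A minimal sketch of evaluating closedItself against one follow-up's answers. It handles only the shape seen in this dump (a list of groups, each an "or" of "eq" clauses) and raises on anything else, on the assumption that an unknown operator should stop the import rather than be guessed at.

```python
def clause_matches(clause: dict, answers: dict) -> bool:
    """Evaluate one {"order": ..., "value": {"eq": ...}} clause."""
    operators = set(clause["value"])
    if operators != {"eq"}:
        raise ValueError(f"unsupported operators: {sorted(operators)}")
    answer = answers.get(str(clause["order"]))
    # Compare as strings: option codes are strings in the export.
    return answer is not None and str(answer) == str(clause["value"]["eq"])


def is_closed(closed_itself: list, answers: dict) -> bool:
    """True when any group matches; an empty list never closes the chain."""
    for group in closed_itself:
        if set(group) != {"or"}:
            raise ValueError(f"unsupported group keys: {sorted(group)}")
        if any(clause_matches(c, answers) for c in group["or"]):
            return True
    return False
```

For form 1003, whose closedItself is empty, is_closed always returns False, which matches the open question about whether 1003 closure is enforced elsewhere.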
Import-time collapse (the actual D3 transformation)
For each ★ form’s response set:
- Group all responses by parentResponseId (the response in the picker at orderToDisplayIn).
- The chain root = the response that picks null (first follow-up); subsequent rows form a linked list via the closedItself-pickable order.
- Output: one Frappe record per application (the parent in the non-self entry, e.g., 1002 Scheme Application’s response), and one audit-log child row per follow-up response, capturing only the orders that change between consecutive follow-ups.
- Mark the application as closed when the latest follow-up’s status hits the closedItself set.
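The steps above can be sketched as follows. The response shape is entirely hypothetical, because no response data is in this dump: each follow-up is assumed to be a dict with id, application_id, previous_id (None for the first follow-up) and answers keyed by order. The real field names must come from the response export.

```python
from collections import defaultdict


def order_chain(followups: list[dict]) -> list[dict]:
    """Order one application's follow-ups from the root to the latest."""
    by_previous = {r["previous_id"]: r for r in followups}
    if len(by_previous) != len(followups):
        raise ValueError("chain branches: two follow-ups share a predecessor")
    chain, current = [], by_previous.get(None)
    while current is not None:
        chain.append(current)
        current = by_previous.get(current["id"])
    if len(chain) != len(followups):
        raise ValueError("broken chain: follow-ups unreachable from the root")
    return chain


def collapse(followups: list[dict], status_order: str, terminal: set) -> dict:
    """application_id -> {"log": [...], "is_closed": bool}."""
    by_application = defaultdict(list)
    for response in followups:
        by_application[response["application_id"]].append(response)

    result = {}
    for application_id, rows in by_application.items():
        chain = order_chain(rows)
        log, previous = [], {}
        for response in chain:
            # Keep only the orders whose value differs from the prior step.
            changes = {
                order: value
                for order, value in response["answers"].items()
                if previous.get(order) != value
            }
            log.append({"response_id": response["id"], "changes": changes})
            previous = response["answers"]
        latest_status = str(chain[-1]["answers"].get(status_order))
        result[application_id] = {
            "log": log,
            "is_closed": latest_status in terminal,
        }
    return result
```

The terminal set would be derived from closedItself (for 1011, the six codes on order 22). Branching or unreachable rows raise instead of being silently dropped, since either would mean the linked-list assumption does not hold for the real data.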
7. Geography questions
In every form, six top-level fields tell you which questions hold geography:
stateOrder, districtOrder, blockOrder, gramPanchayatOrder, villageOrder, hamletOrder
0 = “not present in this form”. Example for 1000: state=1, district=2, block=3, GP=34, village=35, hamlet=0 — geography is split across the form (basic three at the top, finer two further down). Geography questions have input_type=3, empty answer_option, and a parent that hierarchically constrains the next level (district parent = state, etc.).
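A minimal sketch of reading those six pointers into a level-to-order map, dropping the levels marked 0 and coercing orders to strings so they line up with question order values. The level names are taken from the key prefixes; the function name is illustrative.

```python
GEOGRAPHY_LEVELS = (
    "state", "district", "block", "gramPanchayat", "village", "hamlet",
)


def geography_orders(form: dict) -> dict[str, str]:
    """Return {level: order} for the geography levels present in the form."""
    found = {}
    for level in GEOGRAPHY_LEVELS:
        value = str(form.get(f"{level}Order", 0))
        if value != "0":  # 0 means the level is not present
            found[level] = value
    return found
```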
The import pipeline should resolve geography option values to canonical IDs from a Frappe Geography Doctype hierarchy; these are not in the form export, they’re a runtime master.
Geography fields on child forms (1002, 1003, 1004, 1005, 1010, 1011) get prefilled from the member via getDynamicOption.dataOrdersMapping — which is exactly P1/P2 of the kickoff brief. The fact that this works “when the surveyor has 1 geography but breaks with multiple” is a runtime bug in mForm, not a data shape problem — V3 just needs to honour the same dataOrdersMapping semantics correctly.
8. Master data references
Three masters in this dump: 1001 Scheme Master, 1008 Document Master, 1009 Donor Master. They look like ordinary forms (low question count, mostly text/select), and other forms reference them via getDynamicOption with filterBy: NONE (global) or GEOGRAPHY.
What’s NOT in the dump: the master data. The Scheme Master is just a 12-question form definition; the 107 actual schemes (PM Jan Dhan, MGNREGA, etc.) live as responses to that form, not as part of this export. To import schemes/documents/donors we need the response data, not just the form definition.
9. Multi-language
language is a list, sized 1, with lng="en" only — across all 13 forms.
The kickoff brief lists English / Kannada / Telugu. Either the export filtered to en, or the translations live in a system-level translation table. Action: confirm with the mForm team where Kannada and Telugu strings live and whether they can be exported parallel to the form JSON. Don’t design the import pipeline assuming language[] will magically grow.
10. Versioning / IDs / preservation
- _id (24-hex Mongo ObjectID) on the form, every question, every validation, every getDynamicOption, every dataOrdersMapping. Preserve at import; these are the join keys to map response data later.
- formId vs publishedFormId: both stable, both should be carried into Frappe (e.g., as mform_internal_id and mform_published_id fields on the Doctype).
- No created_at / updated_at in the form export; versioning is implicit. If the V2 instance stores form revisions, that’s a separate API.
11. Surprises / quirks / risks
- isMaster is always false, even for forms with “Master” in the title. Don’t trust this field. Use the title or the dependency direction.
- Numeric orders as strings. "order": "1" not 1. JSON-Schema-style numeric comparisons will silently fail; the pipeline must coerce.
- Regex-encoded skip logic. value: "^([1])$" is a regex, not a string. Trivial-looking but non-trivial when value is ^(1|2|3)$ or ^([1][0])$ (note: that’s “10”, anchored). String equality will mis-evaluate; use a regex engine for the runtime, or pre-parse to normalised form at import.
- Validation rules are empty. Every validation[] entry has condition: null in this dump. Either validation lives elsewhere or only input_type-implicit rules are enforced. Don’t assume the import preserves validation; confirm with mForm team.
- up-test form (1012) is in the bundle. Filter it out at import time. Its formId=14 (note the gap; formId=13 was deleted at some point, which is why there are 13 files but the highest formId is 14).
- Master form data is not in this dump. 1001/1008/1009 are form definitions; the schemes/docs/donors stored as responses to those forms must come via a separate response-data export. The probe script (A15) should fetch a sample.
- Three input types unmapped (13, 29, 30). Few instances each. Pipeline must fail loudly on unmapped types; silent default-to-Data is a long-tail data-loss bug.
- Multi-select 4 semantics. Inline answer_option array, but no documentation in the JSON about whether the stored value is "1,3,5" or ["1","3","5"]. Confirm with response data.
- isCustomLabel, informationType, convertedInformationUrls: all present, all uniform, no documentation. If unused, drop at import; if used, capture meaning before designing the schema.
- profileForm (1006) and filterForm (1007) look like UI configuration, not data forms (2 and 3 questions respectively). Likely tell mForm “which fields to show on the member card / list filter.” Re-implement as Frappe List View Settings / dashboard config; do not import as Doctypes.
- closedCreator / matchWithCreator in getDynamicOption: present but empty in every entry. Likely permission gates (“only the creator can see their own pickable options”). Sample more before designing the equivalent.
- Health Screening (1010) has 90 questions and 11 sections; HS Follow-up (1011) has 97 questions and 10 sections. These are by far the largest forms. Generate the Doctype, then pause and review naming/sectioning by hand before bulk-importing responses.
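Until the multi-select encoding is confirmed against response data, the pipeline could accept both candidate encodings and normalise them to one form. A minimal sketch; the two input shapes are the guesses listed above, not observed data, and anything else raises so that a third encoding is noticed.

```python
import json


def normalise_multiselect(raw) -> list[str]:
    """Return selected option IDs as a list of strings."""
    if raw is None or raw == "":
        return []
    if isinstance(raw, (list, tuple)):
        return [str(value) for value in raw]
    if isinstance(raw, str):
        text = raw.strip()
        if text.startswith("["):  # JSON array serialised as a string
            return [str(value) for value in json.loads(text)]
        return [part.strip() for part in text.split(",") if part.strip()]
    raise TypeError(f"unexpected multi-select value: {raw!r}")
```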
12. Mapping table → Frappe Doctype
Proposed top-level Doctype shape per form. All Doctypes get the same migration metadata fields:
mform_published_id Int (== publishedFormId)
mform_internal_id Int (== formId)
mform_response_id Data (Mongo ObjectID of source response, unique key for idempotent re-import)
mform_created_at Datetime (from source response, not form)
imported_at Datetime (when our pipeline wrote the row)
| mForm form | Frappe Doctype | Type | Key parent links | Notes |
|---|---|---|---|---|
| 1000 Member Profiling | Swasti Member | Document | (none — root) | 56 fields, 1 section. The master record. |
| 1001 Scheme Master | Swasti Scheme | Document (Master) | (none) | One row per scheme. Eligibility filters become hooks/queries on the application Doctype, not Doctype fields. |
| 1002 Scheme Application | Swasti Scheme Application | Document | member → Swasti Member, scheme → Swasti Scheme | 43 fields, 5 sections. No “follow-up” subtable yet; see 1003. |
| 1003 Scheme Followup | (collapses into 1002) | child Doctype: Swasti Scheme Followup Log | parent → Scheme Application | ★ self-link → audit log. N self-linked 1003 responses become N child rows under the parent application. Map dataOrdersMapping to fetch_from. |
| 1004 Document Application | Swasti Document Application | Document | member → Swasti Member, document → Swasti Document | 39 fields, 5 sections. |
| 1005 Doc App Follow-up | (collapses into 1004) | child Doctype: Swasti Document Followup Log | parent → Document Application | ★ self-link → audit log. |
| 1006 Profile Form | (drop — UI config) | n/a | — | Re-implement as List View Settings. |
| 1007 Filter Form | (drop — UI config) | n/a | — | Re-implement as List View / dashboard filter. |
| 1008 Document Master | Swasti Document | Document (Master) | (none) | One row per document type. |
| 1009 Donor Master | Swasti Donor | Document (Master) | (none, but geography-scoped) | Geography filter at runtime, not a static parent link. |
| 1010 Health Screening | Swasti Health Screening | Document | member → Swasti Member | 90 fields, 11 sections — biggest form. |
| 1011 HS Follow-up | (collapses into 1010) | child Doctype: Swasti Health Screening Followup Log | parent → Health Screening | ★ self-link → audit log. Six terminal statuses (codes 2/4/5/12/13/14 on order 22) close the chain. |
| 1012 up-test | (skip: test form) | n/a | n/a | Developer test form; exclude from production import. |
Field-by-field. The conversion is mechanical given the order→fieldname registry:
- text(1) → Data; number(2) → Int or Float (sniff from data); date(14) → Date.
- select(5) with inline options → Select with newline-separated options.
- multi-select(4) with inline options → child table Swasti <Form> <Question> Multi with option Data field, OR comma-string + UI control (decide once, project-wide).
- hierarchical(3) → Link to Swasti Geography (single Doctype, parent-child Tree). Pre-fill via Fetch From on dependent levels.
- image(7) → Attach Image. attach(11) → Attach.
- consent(22) → Select (Agree/Disagree) with the consent body stored on a config Doctype (Swasti Consent Block), referenced by ID; don’t paste consent prose into every Doctype field.
- section(10) → Section Break (label = the section title).
- parent[] → Doctype field depends_on (translate the regex to a JS expression).
- child[] → derivable from parent[]; don’t double-store.
- dataOrdersMapping → Fetch From on the target field, source path <link_fieldname>.<source_fieldname> (Frappe keys Fetch From on the Link field’s fieldname in the target Doctype, not on the source Doctype’s name).
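For the parent[] to depends_on step, one way to avoid hand-translating each regex is to test it against the controlling question's known option IDs and emit an explicit list. A minimal sketch under several assumptions: the stored value of a select equals the option _id as a string, multiple parent conditions combine with AND, and the fieldnames and option_ids lookups (both keyed by order) are built elsewhere. The eval: prefix is Frappe's convention for JavaScript depends_on expressions.

```python
import json
import re


def depends_on(parents: list, fieldnames: dict, option_ids: dict) -> str:
    """Build a Frappe depends_on expression from a question's parent[].

    fieldnames: order -> Frappe fieldname of the controlling question
    option_ids: order -> list of that question's option IDs (strings)
    """
    clauses = []
    for condition in parents:
        order = str(condition["order"])
        allowed = [
            option_id
            for option_id in option_ids[order]
            if re.fullmatch(condition["value"], option_id)
        ]
        if not allowed:
            raise ValueError(
                f"order {order}: {condition['value']!r} matches no known option"
            )
        clauses.append(f"{json.dumps(allowed)}.includes(doc.{fieldnames[order]})")
    if not clauses:
        return ""
    return "eval:" + " && ".join(clauses)
```

Enumerating against option IDs sidesteps parsing character classes such as ^([1][0])$, at the cost of needing the option list for every controlling question; geography questions (type 3), whose options are not in the export, would need a different approach.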
13. Open questions for the import pipeline
- Where do Kannada / Telugu translations live? Not in this dump. Without them, the V3 mobile app launches English-only.
- Where does master data come from? The Scheme Master form (1001) is here, but the 107 schemes themselves are responses, which are not in this dump.
- Where do response data live and how do we fetch them? REST API, DB export, both? See A14.
- What’s the encoding for multi-select stored values? "1,3,5" or ["1","3","5"]?
- What do input_type codes 13, 29, 30 mean? Sample and confirm before designing field mapping.
- Are validation rules really empty, or is the export filtering them out?
- What does closedCreator / matchWithCreator actually do?
- Are there form revisions / version history we need to preserve, or only the current published form?
- Does the system geography master have stable IDs across V2 and V3? If not, we need an ID-mapping pass.
- What’s the canonical list of terminal statuses for Scheme Followup (1003)? closedItself is empty; confirm with the team whether 1003 truly has no terminal-status closure, or whether it’s enforced in app logic.