Swasti · mForm V2→V3

mForm V2 form JSON — shape analysis

Companion to 01-kickoff-brief.md §10 (the import pipeline). This doc is for the dev who will write A15 (probe script) and A16 (import pipeline). Source files live at ~/Dhwani/swasti-mform-migration/raw/mform-forms/ (extracted from swasti_forms_json.zip).

TL;DR — the three things that matter most

  1. Self-linking is getDynamicOption with formId == own publishedFormId. Forms 1003, 1005, 1011 each contain a getDynamicOption entry pointing at themselves. The dataOrdersMapping inside that entry says which fields are copied forward from the previous follow-up to the new one (i.e., how the response chain is glued together). The closedItself array on form 1011 contains the terminal statuses that stop self-linking — this is the explicit list we need to preserve when we collapse N self-linked responses into one Frappe record + N audit-log rows.
  2. There is no separate “form name” field. The human-readable title ("3.2 Scheme Followup", etc.) is buried at language[0].title. Question text is at language[i].question[].title. Everywhere else in the form, questions are referenced by their numeric order — not by _id, not by shortKey. Order is a string. A migration that loses the order-to-Doctype-fieldname mapping is unrecoverable.
  3. Languages: only en was exported. All 13 forms have language length 1 with lng="en". The Kannada/Telugu copy lives elsewhere (probably a system-level translations table not in this dump). Confirm with the mForm team before assuming we have the full picture for §P7 of the kickoff brief.

1. Inventory

13 files, all flat at the root of the zip, named <publishedFormId>.json. Each form carries two IDs: an internal one (formId, monotonic 1–14, gap at 13) and a published one (publishedFormId, which is also the file name). All forms are isMaster=false even though some are clearly masters (Scheme Master, Document Master, Donor Master), so isMaster is unreliable; trust the title.

File | formId | pubId | Title | Qs | Sections (input_type=10)
1000.json | 1 | 1000 | 2. Member Profiling | 56 | 1
1001.json | 2 | 1001 | 1.1 Scheme Master | 12 | 0
1002.json | 3 | 1002 | 3.1 Scheme Application | 43 | 5
1003.json | 4 | 1003 | 3.2 Scheme Followup ★self-link | 37 | 9
1004.json | 5 | 1004 | 4.1 Document Application | 39 | 5
1005.json | 6 | 1005 | 4.2 Document Application Follow-up ★self-link | 39 | 9
1006.json | 7 | 1006 | Profile Form | 2 | 0
1007.json | 8 | 1007 | Filter Form | 3 | 0
1008.json | 9 | 1008 | 1.2 Document Master | 2 | 0
1009.json | 10 | 1009 | Donor Master | 6 | 0
1010.json | 11 | 1010 | 5.1 Health Screening | 90 | 11
1011.json | 12 | 1011 | 5.2 Health Screening Follow-up ★self-link | 97 | 10
1012.json | 14 | 1012 | up-test | 57 | 1

Bold = main programs from the kickoff. ★ = forms that self-link (response-sprawl pattern → must collapse to audit log on import).
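
The inventory above could be regenerated by the probe script (A15) straight from the files. The following is a minimal sketch under assumptions, not a finished tool: the helper names summarize_form and inventory are hypothetical, each file is assumed to parse as plain JSON with the shape described in §2, and every question with input_type "10" is counted as a section. It reports len(question) as the question count, without knowing whether the Qs column above includes section headers.

```python
import json
from pathlib import Path


def summarize_form(form: dict) -> dict:
    """Reduce one form definition to an inventory row (hypothetical helper)."""
    lang = form["language"][0]          # only "en" is present in this dump
    questions = lang.get("question", [])
    return {
        "formId": form["formId"],
        "publishedFormId": form["publishedFormId"],
        "title": lang["title"],
        "questions": len(questions),
        # input_type is a string code; "10" marks a section header (see §3)
        "sections": sum(1 for q in questions if q.get("input_type") == "10"),
    }


def inventory(folder: str) -> list[dict]:
    """Summarize every <publishedFormId>.json file found in a folder."""
    rows = []
    for path in sorted(Path(folder).glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            rows.append(summarize_form(json.load(fh)))
    return rows
```

Comparing the output of inventory() against the table is a cheap first check that the extracted files are complete.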

Missing from this dump

  • No system geography master (state/district/block/GP/village data) — geography options are pulled from a system-level master not included.
  • No translations for Kannada/Telugu.
  • No system master for surveyors/users.
  • No Livelihood program forms (the 4th, new program) — expected, since it isn’t on V2 yet.
  • up-test (1012) is a developer test form — exclude from production import.

2. Top-level form schema

{
  "_id": "66e912180388f75534092900",      // Mongo ObjectID
  "formId": 1,                              // internal, monotonic per project
  "publishedFormId": 1000,                  // public ID == file name
  "publish": true, "hold": false, "isLocked": false,
  "isMaster": false,                        // unreliable, see §1
  "isCustomLabel": false,
  "repairImportedQuestion": true,
  "formType": null,
  "projectOrder": 0, "reportedDateOrder": 0, "surveyorOrder": 0,
  "stateOrder": 1, "districtOrder": 2, "blockOrder": 3,
  "gramPanchayatOrder": 34, "villageOrder": 35, "hamletOrder": 0,
  // ↑ orders of the GEOGRAPHY questions in this form (0 = not present).

  "keyInfoOrders": ["53","57"],             // orders shown on list/card view
  "searchOrders":   ["53","57"],             // orders searchable in list view
  "duplicateCheckQuestions": [],            // orders that form a unique-key tuple

  "maskingConfig": [],                      // PII masking rules (empty in dump)
  "masterConfig":  [],                      // legacy?  empty in every form

  "createDynamicOption": [...],             // "I publish options to other forms"
  "getDynamicOption":    [...],             // "I pull options from other forms"
  "language": [
    {
      "lng": "en",
      "title": "2. Member Profiling",
      "buttons": { ... },                   // localized button labels
      "question": [                         // 🔑 the actual form
        { ...question shape... }
      ]
    }
  ]
}

Question shape

{
  "_id": "66e9122487cda931cb1809a4",
  "order": "1",                  // 🔑 the stable identifier
  "shortKey": "state",            // optional, often "orderNN" or human slug
  "label":     "2",               // display number ("2.")
  "viewSequence": "2",            // visual order (may differ from "order")
  "title":  "State",              // localized question text
  "hint":   "",
  "input_type": "3",              // 🔑 see §3
  "answer_option": [              // for select-types only
    { "_id": "1", "name": "Aadhaar", "did": [], "shortKey": "" }
  ],
  "validation":   [ { "_id":"...", "error_msg":"", "condition": null } ],
  "restrictions": [ { "orders": [...], ... } ],
  "child":        [ { "type":"3", "value":"^([1])$", "order":"49" } ], // skip-logic — see §4
  "parent":       [ { "order":"19", "type":"22", "value":"^([1])$" } ], // visibility — see §4
  "isToBeEncrypted": false,        // PII flag
  "editable": false,               // can the value be edited after first save
  "weightage": [],                 // scoring (unused in this dump)
  "information": "", "informationType": "text",
  "information_urls": [], "convertedInformationUrls": [], "resource_urls": []
}

The order field is the source of truth for everything. parent[].order, child[].order, getDynamicOption.dataOrdersMapping.fromOrder/toOrder, keyInfoOrders, searchOrders, stateOrder, etc. all reference questions by numeric order. Lose the order mapping and the import is dead.


3. input_type codes

mForm uses small numeric strings as type codes. Mapping deduced from sample questions across all 13 forms:

Code | Meaning (deduced) | Frappe target | Notes
1 | Text | Data | “Enter Name”
2 | Number | Int / Float | “Annual Income”
3 | Hierarchical select / Geography | Link | “State”/“District”: empty answer_option, options resolved at runtime via system geography master. Has parent for cascading.
4 | Multi-select (checkboxes) | Table MultiSelect or comma-string | “Which documents do you have?” with 12 inline answer_option.
5 | Single-select (radio) | Select or Link | “Gender”: Male/Female/Trans, inline options.
7 | Image upload | Attach Image | “Upload Image for ID 1”
10 | Section header | (no field; Section Break) | label is empty, title is the section name. This is how forms get sectioned.
11 | File / document attach | Attach | “Please attach the Disability Certificate”
13 | Unconfirmed | TBD | Present in distribution count but not in sampled forms; investigate 1002/1004/1010 before pipeline.
14 | Date | Date | “Enter Date of Birth”
22 | Consent / yes-no with copy | Select (2 options) | “Consent Form”: Agree/Disagree, with rich information body.
27 | Text (typeahead / freeform name) | Data | “Individual Facilitator”: no answer_option, behaves like text.
29, 30 | Unconfirmed | TBD | Each present in 2 forms; likely advanced types (signature? rating? location?). Probe before pipeline.

Counts across all 13 forms: type 1 ×12, 3 ×11, 4 ×10, 5 ×8, 2 ×8, 14 ×8, 10 ×8, 7 ×5, 27 ×5, 11 ×4, 13 ×3, 30 ×2, 29 ×2, 22 ×2.

These counts tally the number of forms in which each type appears at least once (distinct types per form), not the number of questions. Either way, codes 13/29/30 are rare and need a manual sample before mapping; the import pipeline must fail loudly if it sees an unmapped type.
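
One way to get the fail-loud behaviour is a lookup table that simply has no entry for the unconfirmed codes. This is a sketch only: the names INPUT_TYPE_TO_FIELDTYPE, UnmappedInputType and fieldtype_for are hypothetical, and where the table above offers two Frappe targets the one chosen here is a placeholder assumption, not a decision.

```python
# Confirmed codes only; 13, 29 and 30 are deliberately absent.
INPUT_TYPE_TO_FIELDTYPE = {
    "1": "Data",
    "2": "Float",              # assumption: Int vs Float is decided later from data
    "3": "Link",
    "4": "Table MultiSelect",  # assumption: comma-string is the other candidate
    "5": "Select",
    "7": "Attach Image",
    "10": "Section Break",
    "11": "Attach",
    "14": "Date",
    "22": "Select",
    "27": "Data",
}


class UnmappedInputType(ValueError):
    """Raised when a question uses an input_type with no agreed mapping."""


def fieldtype_for(question: dict) -> str:
    """Return the Frappe fieldtype for a question, or raise if unmapped."""
    code = str(question["input_type"])
    try:
        return INPUT_TYPE_TO_FIELDTYPE[code]
    except KeyError:
        raise UnmappedInputType(
            f"order {question.get('order')!r}: unmapped input_type {code!r}"
        ) from None
```

Raising instead of defaulting to Data means a stray type 29 stops the import at Doctype-generation time rather than surfacing later as lost data.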


4. Logic: skip, validation, prefill

Skip / visibility — parent and child

Every question has both parent (who controls my visibility) and child (whom I control). The encoding is symmetric:

// Question 1 (order="1"): "State"
"parent": [ { "order": "19", "type": "22", "value": "^([1])$" } ]
//          ↑ "show me when question of order 19 (a type-22 consent question) has value '1' (Agree)"

// Question 19 (order="19"): "Consent Form"
"child": [
  { "type": "3", "value": "^(1)$", "order": "1"  },  // controls State
  { "type": "3", "value": "^(1)$", "order": "2"  },  // controls District
  { "type": "1", "value": "^(1)$", "order": "4"  },  // controls Name
  ...
]

Important: value is a regex (anchored with ^...$), not an equality. The pipeline must preserve regex semantics — most are simple ^([1])$ but some are ^(1|2|3)$, etc. Don’t convert blindly to value == "1".
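
A minimal sketch of evaluating visibility with regex semantics intact, using Python's re.fullmatch. The function name is_visible and the answers structure (order string mapped to stored value) are assumptions for illustration; combining multiple parent entries with AND is also an assumption that should be confirmed against mForm's behaviour.

```python
import re


def is_visible(question: dict, answers: dict) -> bool:
    """Return True when every parent condition matches the current answers.

    `answers` maps order (string) -> stored value (string). Treating
    multiple parent entries as AND is an assumption, not confirmed.
    """
    for cond in question.get("parent", []):
        value = answers.get(cond["order"])
        # An unanswered controlling question hides the dependent question.
        if value is None or re.fullmatch(cond["value"], str(value)) is None:
            return False
    return True
```

With the State question from the example, is_visible(state, {"19": "1"}) is true and is_visible(state, {"19": "2"}) is false; a pattern such as ^(1|2|3)$ accepts "2" but rejects "12".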

Validation

"validation": [
  { "_id": "55.1", "error_msg": "", "value": "", "condition": null },
  { "_id": "3.2",  "error_msg": "", "condition": null }
]

In this dump: every validation[] entry has condition: null and empty error_msg. Validation rules are not encoded inline in the export we have. Either (a) the validation engine derives rules from input_type + restrictions alone, or (b) validation lives in a separate config not exported. Confirm with mForm team before relying on import-time validation.

Restrictions

restrictions is non-empty on some questions and contains orders[] — likely the orders that this question’s value cannot duplicate (e.g., “this name field can’t equal the value already entered in question 12”). Sample shape only — the few non-empty examples in this dump have { "orders": [...] }. Treat as unconfirmed — sample more before mapping.

Prefill — dataOrdersMapping

The cleanest piece of logic in the schema. Inside getDynamicOption[].dataOrdersMapping:

{
  "fromOrder": "21",   // pull the value from this order in the SOURCE form
  "toOrder":   "12"    // and prefill it into this order in the CURRENT form
}

Example (form 1003 Scheme Followup pulling from form 1002 Scheme Application):

  • fromOrder=21 → toOrder=12 (status field carries forward from the application to the followup).

This is exactly the mechanism that maps to Frappe’s Fetch From field property — but it’s keyed on numeric orders, so we need a (formId, order) → fieldname registry generated alongside the Doctypes.
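
A sketch of what such a registry could look like. Everything here is an assumption for illustration: the function name build_registry, the choice to key on publishedFormId (because getDynamicOption.formId holds published IDs in the samples), the slug rule for shortKey, and the order_NN fallback. Real Frappe fieldname rules may need stricter handling.

```python
import re


def build_registry(form: dict) -> dict:
    """Map (publishedFormId, order) -> fieldname for one form definition."""
    registry = {}
    used = set()
    for q in form["language"][0]["question"]:
        if q.get("input_type") == "10":      # section headers hold no value
            continue
        # Slugify shortKey when present, otherwise fall back to the order.
        slug = re.sub(r"[^a-z0-9]+", "_", (q.get("shortKey") or "").lower()).strip("_")
        name = slug or f"order_{q['order']}"
        if name in used:                     # keep fieldnames unique per form
            name = f"{name}_{q['order']}"
        used.add(name)
        registry[(form["publishedFormId"], q["order"])] = name
    return registry
```

Persisting this mapping next to the generated Doctypes keeps every order-based reference (skip logic, prefill, key-info orders) resolvable after import.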


5. Cross-form references — getDynamicOption & createDynamicOption

The whole reference graph for the 13 forms:

Form | Pulls options from | Filter | Notes
1000 Member Profile | 1009 Donor Master | GEOGRAPHY | Donor list scoped to member’s geography.
1002 Scheme Application | 1000 Member, 1001 Scheme Master | USER / NONE | Member picker = surveyor’s members; Scheme master = global.
1003 Scheme Followup | 1002 Scheme App, 1003 (self) | USER / USER | ★ self-link, see §6.
1004 Document Application | 1000 Member, 1008 Doc Master | USER / NONE | Same shape as Scheme.
1005 Doc App Follow-up | 1004 Doc App, 1005 (self) | USER / USER | ★ self-link
1010 Health Screening | 1000 Member | USER | No global health master; HS is self-contained.
1011 HS Follow-up | 1010 HS, 1011 (self) | USER / USER | ★ self-link. Has closedItself terminal-status list.
1012 up-test | 1009 Donor Master | GEOGRAPHY | Test form.

filterBy values seen: USER (scope to the current surveyor’s responses), GEOGRAPHY (scope to surveyor’s assigned geographies), NONE (global master). Confirms the kickoff finding that scoping is per-form (Scheme/Document = USER, Health = USER but should be GEOGRAPHY post-V3).

createDynamicOption is the inverse: it declares “responses from this form become options elsewhere” with parentOrder (which order to expose), optionIdentifier (what field is the option label), and conditions[] (only expose when those conditions match — e.g. “expose schemes 1..12 to form 3 only”).


6. ★ The self-linking pattern (kickoff D3 — replace with audit log)

Detection rule

form.getDynamicOption[*].formId == form.publishedFormId ⇒ this form self-links.

Confirmed in 1003, 1005, 1011. Each has two getDynamicOption entries: one for the parent (e.g., 1003 → 1002 Scheme App) and one for self (1003 → 1003).
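
The detection rule translates almost directly into code. In this sketch the helper names self_link_entries and is_self_linking are hypothetical, and the string comparison is a defensive assumption in case some export stores formId as a string.

```python
def self_link_entries(form: dict) -> list:
    """Return the getDynamicOption entries that point back at this form."""
    own_id = form["publishedFormId"]
    return [
        entry
        for entry in form.get("getDynamicOption", [])
        # compare as strings in case an export stores formId as "1003"
        if str(entry.get("formId")) == str(own_id)
    ]


def is_self_linking(form: dict) -> bool:
    """True when at least one getDynamicOption entry targets the form itself."""
    return bool(self_link_entries(form))
```

Run over the 13 files, this should flag exactly 1003, 1005 and 1011; any other result is worth investigating before the import is written.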

Self-link entry for 1003 Scheme Followup

{
  "formId": 1003,                  // ← itself
  "filterBy": ["USER"],
  "orderToDisplayIn": "11",        // a "previous follow-up" picker at order 11
  "isReusuable": false,            // can't re-pick the same prior follow-up
  "isPrimary": false,
  "dataOrdersMapping": [           // copy these fields forward
    { "fromOrder": "16", "toOrder": "25" },
    { "fromOrder": "18", "toOrder": "26" },
    { "fromOrder": "20", "toOrder": "27" }
  ],
  "closedItself": []               // 1003 has no terminal-status list
}

So a fresh 1003 response = pick the previous follow-up at order 11 → orders 16/18/20 from that previous follow-up auto-fill into the new response’s orders 25/26/27.
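
The copy-forward step can be sketched as a small function. The name prefill and the answers structure (order string mapped to value) are assumptions for illustration; how mForm treats an order the source never answered is not known, so this sketch just skips it.

```python
def prefill(data_orders_mapping: list, source_answers: dict) -> dict:
    """Copy values from a source response into a new response's orders."""
    new_answers = {}
    for pair in data_orders_mapping:
        src, dst = pair["fromOrder"], pair["toOrder"]
        if src in source_answers:          # skip orders the source never answered
            new_answers[dst] = source_answers[src]
    return new_answers
```

Given the 1003 mapping above and a previous follow-up with answers at orders 16, 18 and 20, the result holds those same values at orders 25, 26 and 27.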

Terminal statuses for 1011 HS Follow-up

"closedItself": [
  { "or": [
      { "order": "22", "value": { "eq": "2"  } },
      { "order": "22", "value": { "eq": "4"  } },
      { "order": "22", "value": { "eq": "5"  } },
      { "order": "22", "value": { "eq": "12" } },
      { "order": "22", "value": { "eq": "13" } },
      { "order": "22", "value": { "eq": "14" } }
  ]}
]

“Stop allowing further follow-ups when the previous follow-up’s order-22 (status field) equals 2, 4, 5, 12, 13, or 14.”

These are the terminal status codes the kickoff brief mentions abstractly. Capture them now — when collapsing self-linked responses into a Frappe audit log, the import pipeline should mark the parent record as is_closed=true exactly when the latest log entry’s status is in this set.
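
A sketch of evaluating closedItself against one follow-up's answers. The function name is_terminal is hypothetical, and only the shape seen on form 1011 is handled (a list of {"or": [...]} groups with "eq" comparisons). Treating a match in any group as terminal is an assumption, since the dump shows only one group.

```python
def is_terminal(closed_itself: list, answers: dict) -> bool:
    """Evaluate a closedItself list against one follow-up's answers.

    Only the {"or": [{"order", "value": {"eq"}}]} shape seen on form 1011
    is handled; anything else raises so new shapes are not silently ignored.
    """
    for group in closed_itself:
        if set(group) != {"or"}:
            raise ValueError(f"unsupported closedItself group: {group!r}")
        for cond in group["or"]:
            if set(cond["value"]) != {"eq"}:
                raise ValueError(f"unsupported operator: {cond['value']!r}")
            # Values are compared as strings, matching how orders are stored.
            if str(answers.get(cond["order"])) == str(cond["value"]["eq"]):
                return True
    return False
```

An empty list, as on form 1003, never closes the chain.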

Import-time collapse (the actual D3 transformation)

For each ★ form’s response set:

  1. Group all responses by parentResponseId (the response in the picker at orderToDisplayIn).
  2. The chain root = the response whose picker is null (the first follow-up); subsequent rows form a linked list via the “previous follow-up” picker at orderToDisplayIn.
  3. Output: one Frappe record per application (the parent in the non-self entry — e.g., 1002 Scheme Application’s response), and one audit-log child row per follow-up response — capturing only the orders that change between consecutive follow-ups.
  4. Mark the application as closed when the latest follow-up’s status hits the closedItself set.
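
The steps above can be sketched for a single application's follow-ups as follows. This is illustrative only: the response shape ({"_id", "answers"}), the function name collapse_chain, and the assumption that the picker stores the previous follow-up's _id are all unconfirmed until response data is sampled (A14/A15).

```python
def collapse_chain(responses: list, prev_order: str) -> list:
    """Order one application's follow-ups and keep only changed answers.

    Each response is assumed to look like
    {"_id": str, "answers": {order: value}}, where answers[prev_order]
    holds the _id of the previously picked follow-up (missing for the root).
    """
    by_prev = {}
    for r in responses:
        prev = r["answers"].get(prev_order) or None
        if prev in by_prev:
            raise ValueError(f"chain forks after {prev!r}")
        by_prev[prev] = r

    rows, previous, current = [], {}, by_prev.get(None)
    while current is not None:
        answers = {k: v for k, v in current["answers"].items() if k != prev_order}
        changed = {k: v for k, v in answers.items() if previous.get(k) != v}
        rows.append({"mform_response_id": current["_id"], "changed": changed})
        previous = answers
        current = by_prev.get(current["_id"])
    if len(rows) != len(responses):
        raise ValueError("some follow-ups are not reachable from the chain root")
    return rows
```

Forked or orphaned chains raise rather than being flattened, so inconsistent source data is surfaced for manual review instead of producing a misleading audit log.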

7. Geography questions

In every form, six top-level fields tell you which questions hold geography:

stateOrder, districtOrder, blockOrder, gramPanchayatOrder, villageOrder, hamletOrder

0 = “not present in this form”. Example for 1000: state=1, district=2, block=3, GP=34, village=35, hamlet=0 — geography is split across the form (basic three at the top, finer two further down). Geography questions have input_type=3, empty answer_option, and a parent that hierarchically constrains the next level (district parent = state, etc.).
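
A small sketch of pulling the geography orders out of a form definition. The names GEO_KEYS and geography_orders are hypothetical; the result uses string orders so it can be looked up directly against question.order.

```python
GEO_KEYS = ["stateOrder", "districtOrder", "blockOrder",
            "gramPanchayatOrder", "villageOrder", "hamletOrder"]


def geography_orders(form: dict) -> dict:
    """Return {level: order-as-string} for the levels present in a form."""
    levels = {}
    for key in GEO_KEYS:
        order = int(form.get(key, 0) or 0)
        if order != 0:                      # 0 means "not present"
            levels[key[: -len("Order")]] = str(order)
    return levels
```

For form 1000 this yields state, district, block, gramPanchayat and village, with hamlet omitted.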

The import pipeline should resolve geography option values to canonical IDs from a Frappe Geography Doctype hierarchy — these are not in the form export; they’re a runtime master.

Geography fields on child forms (1002, 1003, 1004, 1005, 1010, 1011) get prefilled from the member via getDynamicOption.dataOrdersMapping — which is exactly P1/P2 of the kickoff brief. The fact that this works “when the surveyor has 1 geography but breaks with multiple” is a runtime bug in mForm, not a data shape problem — V3 just needs to honour the same dataOrdersMapping semantics correctly.


8. Master data references

Three masters in this dump: 1001 Scheme Master, 1008 Document Master, 1009 Donor Master. They look like ordinary forms (low question count, mostly text/select), and other forms reference them via getDynamicOption with filterBy: NONE (global) or GEOGRAPHY.

What’s NOT in the dump: the master data. The Scheme Master is just a 12-question form definition; the 107 actual schemes (PM Jan Dhan, MGNREGA, etc.) live as responses to that form, not as part of this export. To import schemes/documents/donors we need the response data, not just the form definition.


9. Multi-language

language is a list, sized 1, with lng="en" only — across all 13 forms.

The kickoff brief lists English / Kannada / Telugu. Either the export filtered to en, or the translations live in a system-level translation table. Action: confirm with the mForm team where Kannada and Telugu strings live and whether they can be exported parallel to the form JSON. Don’t design the import pipeline assuming language[] will magically grow.


10. Versioning / IDs / preservation

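  • _id on the form, every question, every validation, every getDynamicOption, every dataOrdersMapping. Forms and questions carry 24-hex Mongo ObjectIDs; the validation entries sampled in §4 carry short dotted IDs ("55.1", "3.2") instead. Preserve all of them at import, since these are the join keys to map response data later.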
  • formId vs publishedFormId — both stable, both should be carried into Frappe (e.g., as mform_internal_id and mform_published_id fields on the Doctype).
  • No created_at / updated_at in the form export — versioning is implicit. If the V2 instance stores form revisions, that’s a separate API.

11. Surprises / quirks / risks

  1. isMaster is always false — even for forms with “Master” in the title. Don’t trust this field. Use the title or the dependency direction.
  2. Numeric orders as strings. "order": "1" not 1. JSON-Schema-style numeric comparisons will silently fail; the pipeline must coerce.
  3. Regex-encoded skip logic. value: "^([1])$" is a regex, not a string. Trivial-looking but non-trivial when value is ^(1|2|3)$ or ^([1][0])$ (note: that’s “10”, anchored). String equality will mis-evaluate; use a regex engine for the runtime, or pre-parse to normalised form at import.
  4. Validation rules are empty. Every validation[] entry has condition: null in this dump. Either validation lives elsewhere or only input_type-implicit rules are enforced. Don’t assume the import preserves validation — confirm with mForm team.
  5. up-test form (1012) is in the bundle. Filter it out at import time. Its formId=14 (note the gap: formId=13 was presumably deleted at some point, which is why there are 13 files but the highest formId is 14).
  6. Master form data is not in this dump. 1001/1008/1009 are form definitions; the schemes/docs/donors stored as responses to those forms must come via a separate response-data export. The probe script (A15) should fetch a sample.
  7. Three input types unmapped (13, 29, 30). Few instances each. Pipeline must fail loudly on unmapped types — silent default-to-Data is a long-tail data-loss bug.
  8. Multi-select 4 semantics. Inline answer_option array, but no documentation in the JSON about whether the stored value is "1,3,5" or ["1","3","5"]. Confirm with response data.
  9. isCustomLabel, informationType, convertedInformationUrls — all present, all uniform, no documentation. If unused, drop at import; if used, capture meaning before designing the schema.
  10. profileForm (1006) and filterForm (1007) look like UI configuration, not data forms — 2 and 3 questions respectively. Likely tell mForm “which fields to show on the member card / list filter.” Re-implement as Frappe List View Settings / dashboard config; do not import as Doctypes.
  11. closedCreator / matchWithCreator in getDynamicOption — present but empty in every entry. Likely permission gates (“only the creator can see their own pickable options”). Sample more before designing the equivalent.
  12. Health Screening (1010) has 90 questions and 11 sections; HS Follow-up (1011) has 97 questions and 10 sections. These are by far the largest forms — generate the Doctype, then pause and review naming/sectioning by hand before bulk-importing responses.
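
Quirk 2 in the list above is easy to reproduce. The sketch below (the helper name order_key is hypothetical) shows how sorting on the raw string order misplaces "10", and how coercing to int fixes it.

```python
def order_key(question: dict) -> int:
    """Sort key that coerces the string order to an int."""
    return int(question["order"])


questions = [{"order": "10"}, {"order": "2"}, {"order": "1"}]

# Lexicographic sort on the raw strings: "10" lands before "2".
as_strings = [q["order"] for q in sorted(questions, key=lambda q: q["order"])]

# Numeric sort after coercion: the intended sequence.
as_ints = [q["order"] for q in sorted(questions, key=order_key)]
```

Here as_strings is ["1", "10", "2"] while as_ints is ["1", "2", "10"].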

12. Mapping table → Frappe Doctype

Proposed top-level Doctype shape per form. All Doctypes get the same migration metadata fields:

mform_published_id  Int      (== publishedFormId)
mform_internal_id   Int      (== formId)
mform_response_id   Data     (Mongo ObjectID of source response, unique key for idempotent re-import)
mform_created_at    Datetime (from source response, not form)
imported_at         Datetime (when our pipeline wrote the row)

mForm form | Frappe Doctype | Type | Key parent links | Notes
1000 Member Profiling | Swasti Member | Document | (none, root) | 56 fields, 1 section. The master record.
1001 Scheme Master | Swasti Scheme | Document (Master) | (none) | One row per scheme. Eligibility filters become hooks/queries on the application Doctype, not Doctype fields.
1002 Scheme Application | Swasti Scheme Application | Document | member → Swasti Member, scheme → Swasti Scheme | 43 fields, 5 sections. No “follow-up” subtable yet; see 1003.
1003 Scheme Followup | (collapses into 1002) | child Doctype: Swasti Scheme Followup Log | parent → Scheme Application | ★ self-link → audit log. N self-linked 1003 responses become N child rows under the parent application. Map dataOrdersMapping to fetch_from.
1004 Document Application | Swasti Document Application | Document | member → Swasti Member, document → Swasti Document | 39 fields, 5 sections.
1005 Doc App Follow-up | (collapses into 1004) | child Doctype: Swasti Document Followup Log | parent → Document Application | ★ self-link → audit log.
1006 Profile Form | (drop, UI config) | n/a | | Re-implement as List View Settings.
1007 Filter Form | (drop, UI config) | n/a | | Re-implement as List View / dashboard filter.
1008 Document Master | Swasti Document | Document (Master) | (none) | One row per document type.
1009 Donor Master | Swasti Donor | Document (Master) | (none, but geography-scoped) | Geography filter at runtime, not a static parent link.
1010 Health Screening | Swasti Health Screening | Document | member → Swasti Member | 90 fields, 11 sections; biggest form.
1011 HS Follow-up | (collapses into 1010) | child Doctype: Swasti Health Screening Followup Log | parent → Health Screening | ★ self-link → audit log. Six terminal statuses (codes 2/4/5/12/13/14 on order 22) close the chain.
1012 up-test | (skip, test form) | | |

Field-by-field. The conversion is mechanical given the order→fieldname registry:

  • text(1) → Data; number(2) → Int or Float (sniff from data); date(14) → Date.
  • select(5) with inline options → Select with newline-separated options.
  • multi-select(4) with inline options → child table Swasti <Form> <Question> Multi with option Data field, OR comma-string + UI control (decide once, project-wide).
  • hierarchical(3) → Link to Swasti Geography (single Doctype, parent-child Tree). Pre-fill via Fetch From on dependent levels.
  • image(7) → Attach Image. attach(11) → Attach.
  • consent(22) → Select (Agree/Disagree) with the consent body stored on a config Doctype (Swasti Consent Block), referenced by ID; don’t paste consent prose into every Doctype field.
  • section(10) → Section Break (label = the section title).
  • parent[] → Doctype field depends_on (translate the regex to a JS expression).
  • child[] → derivable from parent[]; don’t double-store.
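  • dataOrdersMapping → Fetch From on the target field, source path <link_fieldname>.<source_fieldname>, where <link_fieldname> is the Link field on the target Doctype that points at the source record.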
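
For the parent[] → depends_on bullet, one possible pre-parse is sketched below. It handles only the simple pattern shapes quoted in this doc (^([1])$, ^(1)$, ^(1|2|3)$, ^([1][0])$) and raises on anything else. The function names are hypothetical, and the generated expression assumes the Frappe field stores the mForm option code rather than the option label, which is a design decision still to be made.

```python
import re

# Matches the simple shapes seen in the dump: "^(" ... ")$"
_SIMPLE = re.compile(r"^\^\((.+)\)\$$")


def regex_to_values(pattern: str) -> list:
    """Expand a simple anchored alternation into its literal values."""
    m = _SIMPLE.match(pattern)
    if m is None:
        raise ValueError(f"unsupported skip-logic pattern: {pattern!r}")
    values = []
    for alt in m.group(1).split("|"):
        # "[1][0]" -> "10": each bracket must wrap exactly one digit
        literal = re.sub(r"\[(\d)\]", r"\1", alt)
        if not literal.isdigit():
            raise ValueError(f"unsupported skip-logic pattern: {pattern!r}")
        values.append(literal)
    return values


def depends_on(fieldname: str, pattern: str) -> str:
    """Build an eval: expression that is true when the field holds a listed value."""
    values = ", ".join(f'"{v}"' for v in regex_to_values(pattern))
    return f"eval:[{values}].includes(doc.{fieldname})"
```

Patterns that fall outside these shapes (character ranges, nested groups) stop the import so they can be translated by hand.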

13. Open questions for the import pipeline

  1. Where do Kannada / Telugu translations live? Not in this dump. Without them, the V3 mobile app launches English-only.
  2. Where does master data come from? The Scheme Master form (1001) is here, but the 107 schemes themselves are responses — not in this dump.
  3. Where do response data live and how do we fetch them? REST API, DB export, both? See A14.
  4. What’s the encoding for multi-select stored values? "1,3,5" or ["1","3","5"]?
  5. What do input_type codes 13, 29, 30 mean? Sample and confirm before designing field mapping.
  6. Are validation rules really empty, or is the export filtering them out?
  7. What does closedCreator / matchWithCreator actually do?
  8. Are there form revisions / version history we need to preserve, or only the current published form?
  9. Does the system geography master have stable IDs across V2 and V3? If not, we need an ID-mapping pass.
  10. What’s the canonical list of terminal statuses for Scheme Followup (1003)? closedItself is empty — confirm with the team whether 1003 truly has no terminal-status closure, or whether it’s enforced in app logic.

Last updated 2026-05-04