13 — Scheme eligibility (authoritative)
This note specifies how the V3 Scheme Application form surfaces only the schemes a member qualifies for, derived from the PM-supplied Scheme Master CSV (raw/scheme-master-2026-05-20.csv — a dump from V2 production, 186 schemes).
Why this matters: V2’s “Eligible Schemes” picker isn’t a flat list — it’s filtered against the member’s profile. V3 must do the same. The list goes from 186 → typically 30–80 schemes depending on the member’s state / age / gender / caste / occupation / income.
Per Fahim (2026-05-20): “After profiling only applicable schemes are shown.”
This note covers the matching algorithm, the canonicalisation strategy for the document master, the gap analysis vs the current Scheme Master doctype, and the implementation plan.
1. Source data shape
The PM CSV is one-hot encoded — instead of one “states” column with comma-separated values, each state / gender / caste / etc. has its own column whose value is either the category name (this scheme applies) or empty (it doesn’t).
| Family | Columns | Encoding |
|---|---|---|
| State | 8 | State_Kerala, State_Maharashtra, State_Tamil Nadu, State_Madhya Pradesh, State_Andhra Pradesh, State_Karnataka, State_Rajasthan, State_Uttar Pradesh |
| Gender | 3 | Gender_Male, Gender_Female, Gender_Other |
| Caste | 4 | Caste_OBC, Caste_ST, Caste_SC, Caste_General |
| Marital Status | 4 | Marital Status_Single, _Widowed, _Married, _Divorced |
| Occupation | 52 | Occupation_Farmer, Occupation_Student, … (full list in CSV) |
| Documents required | 165 | Select Documents_Aadhaar Card, etc. |
| Age | 1 regex | Age Rule (Regex) — applied against str(age_in_years) |
| Income | 1 regex | Income Rule — applied against str(annual_income_INR) |
Cell semantics: non-empty cell = scheme applies to that category. Empty = doesn’t apply. A scheme that should apply to all four castes has all four caste columns filled with the category name.
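The cell semantics above can be sketched as a small decoder that collapses one one-hot row into a set of category names per family. This is an illustrative sketch, not the importer itself: the `decode_row` helper, the `Scheme Name` column, and the sample row are assumptions, and it assumes every family column carries its full prefix (for example `Marital Status_Widowed`).

```python
import csv
import io

# Column-name prefixes per family, taken from the table above.
FAMILY_PREFIXES = {
    "states": "State_",
    "genders": "Gender_",
    "castes": "Caste_",
    "marital_statuses": "Marital Status_",
    "occupations": "Occupation_",
    "documents": "Select Documents_",
}


def decode_row(row):
    """Collapse one one-hot CSV row into a set of category names per family."""
    decoded = {family: set() for family in FAMILY_PREFIXES}
    for column, cell in row.items():
        if not (cell or "").strip():
            continue  # empty cell: the scheme does not apply to this category
        for family, prefix in FAMILY_PREFIXES.items():
            if column.startswith(prefix):
                # The category name is the column suffix after the prefix.
                decoded[family].add(column[len(prefix):])
                break
    return decoded


# Hypothetical excerpt; the real CSV has far more columns.
SAMPLE = (
    "Scheme Name,State_Kerala,State_Rajasthan,Gender_Female,Caste_OBC,"
    "Select Documents_Aadhaar Card\n"
    "Demo Scheme,,Rajasthan,Female,,Aadhaar Card\n"
)

decoded = decode_row(next(csv.DictReader(io.StringIO(SAMPLE))))
```

In this sample the caste family decodes to an empty set, which is exactly the empty-family case still awaiting confirmation in section 2.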
2. Matching algorithm
A scheme S is applicable to member M if ALL of the following hold:
M.state ∈ S.states
M.gender ∈ S.genders
M.caste ∈ S.castes
M.marital_status ∈ S.marital_statuses
M.occupation ∈ S.occupations
regex(S.age_rule, str(M.age)) matches
regex(S.income_rule, str(M.annual_income)) matches
Empty regex or .* = no filter on that dimension (always passes).
Empty-family semantics (asked Fahim 2026-05-20, response pending): if a scheme has zero values filled in a family, do we exclude or include all? Our current reading: exclude. Will lock once confirmed.
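A minimal in-memory sketch of the predicate above, useful for unit-testing the rule before it is expressed in SQL. The dictionary layout and the `is_applicable` / `rule_passes` names are assumptions for illustration. It applies the current "exclude" reading for empty families, which is still pending confirmation, and uses `re.search` to mirror the unanchored behaviour of SQL `REGEXP`.

```python
import re

# Member field -> scheme field holding the allowed set for that family.
FAMILIES = {
    "state": "states",
    "gender": "genders",
    "caste": "castes",
    "marital_status": "marital_statuses",
    "occupation": "occupations",
}


def rule_passes(rule, value):
    """An empty rule or '.*' means no filter; otherwise match against str(value)."""
    if rule is None or rule.strip() in ("", ".*"):
        return True
    return re.search(rule, str(value)) is not None


def is_applicable(scheme, member):
    """True only if every family and both regex rules accept the member."""
    for member_field, scheme_field in FAMILIES.items():
        allowed = scheme.get(scheme_field) or set()
        # Assumption (pending confirmation): an empty family excludes everyone.
        if member[member_field] not in allowed:
            return False
    return (rule_passes(scheme.get("age_rule"), member["age"])
            and rule_passes(scheme.get("income_rule"), member["annual_income"]))


# Hypothetical scheme and member records.
scheme = {
    "states": {"Rajasthan", "Karnataka"},
    "genders": {"Male", "Female", "Other"},
    "castes": {"OBC", "ST", "SC", "General"},
    "marital_statuses": {"Single", "Married", "Widowed", "Divorced"},
    "occupations": {"Farmer"},
    "age_rule": r"^(1[8-9]|[2-6][0-9]|70)$",
    "income_rule": ".*",
}
member = {
    "state": "Rajasthan", "gender": "Female", "caste": "OBC",
    "marital_status": "Married", "occupation": "Farmer",
    "age": 34, "annual_income": 120000,
}
```

If the empty-family question is resolved the other way, only the membership check inside the loop would need to change.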
Pseudo-SQL (Frappe-side query):
```sql
SELECT name, scheme_name, category, description, scheme_link
FROM `tabScheme Master` s
WHERE
  EXISTS (SELECT 1 FROM `tabScheme Eligibility State` WHERE parent = s.name AND state = :member_state)
  AND EXISTS (SELECT 1 FROM `tabScheme Eligibility Gender` WHERE parent = s.name AND gender = :member_gender)
  AND EXISTS (SELECT 1 FROM `tabScheme Eligibility Caste` WHERE parent = s.name AND caste = :member_caste)
  AND EXISTS (SELECT 1 FROM `tabScheme Eligibility Marital` WHERE parent = s.name AND marital_status = :member_marital)
  AND EXISTS (SELECT 1 FROM `tabScheme Eligibility Occupation` WHERE parent = s.name AND occupation = :member_occupation)
  -- A NULL, empty, or '.*' rule means no filter on that dimension.
  AND (COALESCE(s.age_rule, '') IN ('', '.*') OR :member_age REGEXP s.age_rule)
  AND (COALESCE(s.income_rule, '') IN ('', '.*') OR :member_income REGEXP s.income_rule)
```
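The query assumes the child doctypes from the implementation plan. The current Scheme Master doctype stores each of these dimensions as a single-value field (Select, or Link for state), which cannot represent the multi-value data in the CSV. See the gap analysis in section 6.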
3. Data-coverage stats (V2 production, current state)
| Dimension | Coverage |
|---|---|
| State | RJ 66%, KA 65%, TN 64.5%, UP 64.5%, AP 63%, MP 61%; Kerala 1%, Maharashtra 0% |
| Gender | Female 99.5%, Male 87%, Other 86% |
| Caste | ST 99.5%, OBC/SC/General 99% each |
| Marital Status | Single 98%, Widowed 96%, Married 96%, Divorced 95% |
| Occupation | Farmer 79%, Student 72%, 50+ others with long-tail coverage |
| Age | 107/186 no filter; common bands 18-50, 18-60, 60+, 18-119 |
| Income | 140/186 no filter; common caps ₹6L, ₹3L, ₹5L, ₹1L |
These are the live numbers from V2 prod — not a draft, not incomplete. A Maharashtra member in V2 today sees an empty applicable-schemes list. V3 inherits that behaviour until the client tags Maharashtra-applicable schemes.
Universal schemes (no filters at all): 0/186. Every scheme has at least one filter — usually the state list.
4. Age & income regex patterns
PM expresses age/income limits as regex matched against the string form of the number.
| Regex | Means |
|---|---|
| .* | no filter |
| ^(1[89]$|^[2-4][0-9]$|^50)$ | 18 ≤ age ≤ 50 |
| ^(1[8-9]|[2-6][0-9]|70)$ | 18 ≤ age ≤ 70 |
| ^(6[0-9]$|^[7-9][0-9]$|^1[01][0-9]$|^120)$ | age ≥ 60 (the regex caps at 120) |
| ^(0|[1-9][0-9]{0,4}|[1-5][0-9]{5}|600000)$ | annual income ≤ ₹6L |
| ^([0-9]{1,5}|[1-4][0-9]{5}|500000)$ | annual income ≤ ₹5L |
For implementation we store both: the raw regex (canonical from PM) and pre-derived age_min, age_max, income_max columns populated at import time. The derived columns drive the indexed query; the raw regex is the audit trail. If a future regex does not reduce to a single numeric range, the derived columns stay empty for that scheme and the query falls back to REGEXP against the raw rule.
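One way the import-time derivation could work is to probe the regex with candidate numbers instead of parsing it. The sketch below is an assumption, not the agreed importer logic: the function names, the 0 to 150 age probe range, and the 10^9 income ceiling are all illustrative, and the income search assumes the rule is a pure upper cap (every value up to the cap matches, nothing above it does).

```python
import re


def _no_filter(rule):
    return rule is None or rule.strip() in ("", ".*")


def derive_age_bounds(rule, max_age=150):
    """Return (age_min, age_max), or None when there is no filter or the
    matching ages are not one contiguous band (caller falls back to REGEXP)."""
    if _no_filter(rule):
        return None
    pattern = re.compile(rule)
    ages = [a for a in range(max_age + 1) if pattern.search(str(a))]
    if not ages or ages != list(range(ages[0], ages[-1] + 1)):
        return None
    return ages[0], ages[-1]


def derive_income_max(rule, ceiling=10**9):
    """Return the largest matching income, or None when no cap can be derived.
    Assumes an upper-cap rule: 0 matches, the ceiling does not."""
    if _no_filter(rule):
        return None
    pattern = re.compile(rule)

    def matches(n):
        return pattern.search(str(n)) is not None

    if not matches(0) or matches(ceiling):
        return None
    lo, hi = 0, ceiling  # invariant: lo matches, hi does not
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if matches(mid):
            lo = mid
        else:
            hi = mid
    return lo
```

Probing keeps the importer independent of how PM happens to write each pattern, at the cost of trusting the stated assumptions; a rule that breaks them yields None and is left to the REGEXP fallback.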
5. Document master — single source of truth
Documents are referenced from three places — Member Profiling (which IDs/docs the member possesses), Scheme Application (required docs to apply), Document Application (which document the surveyor is helping the member get). All three use the same Document Master doctype. One canonical name per document, used everywhere.
Today’s state — two parallel masters:
| Doctype | Rows | Used by |
|---|---|---|
| Civic Document Type | 12 | Member Profiling b_prof_doc |
| Document Master | 32 | Document Application |
11 documents overlap with mostly identical naming (Bank Account, Birth Certificate, Community Certificate, Death Certificate, Domicile Certificate, Family ID, Income Certificate, PAN Card, Passport, Ration Card, Voter ID). The twelfth Civic row is a naming conflict: Aadhaar (Civic Document Type) vs Aadhar Card (Document Master). The Government-of-India canonical spelling is Aadhaar Card, so we adopt that.
Target state — single Document Master:
| Field | Type | Purpose |
|---|---|---|
| document_name | Data (autoname) | Canonical Title-Case name, e.g. Aadhaar Card |
| description | Small Text | Optional, surveyor-readable explanation |
| legacy_mongo_id | Data | Idempotent re-import key |
No category field on Document Master. The master is the canonical name registry. Each form declares its own visible subset — surveyors don’t see all 184 docs in one form.
| Form | Visible subset | How it’s defined |
|---|---|---|
| Member Profiling b_prof_doc | 12 civic IDs (Aadhaar Card, PAN Card, Voter ID, …) | Hardcoded in kMemberFlow schema; names match Document Master rows |
| Scheme Application required_documents | Per-scheme list (typically 3–8 documents) | Loaded from the Scheme Master row’s required_documents child rows, which are populated from the V2 CSV Select Documents_* columns at import time |
| Document Application | Application-specific subset (~32 docs today) | Existing field, references Document Master |
All three reference the same canonical names — so a member who marked Aadhaar Card in the Member form, a scheme that requires Aadhaar Card, and a Document Application for Aadhaar Card all join cleanly on the same string. That’s the entire point of unifying the master.
Canonicalisation rules (Title Case, applied in importer):
| From | To |
|---|---|
| Aadhaar / Aadhar Card | Aadhaar Card |
| Disability certificate / Disability Certificate | Disability Certificate |
| Bhamashah Card (dup row) | single row |
| Email ID (dup row) | single row |
| Pan Card | PAN Card |
| Ration card | Ration Card |
| Birth certificate / Domicile certificate (case only) | Title Case |
| All other 155+ scheme-only docs | Title Case, no trailing whitespace |
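The rules in this table could be applied by a single normalising function in the importer. The sketch below is illustrative only: the function name, the alias map, and the acronym set (`PAN`, `ID`) are assumptions, introduced because a naive title-case would turn PAN Card into Pan Card and Email ID into Email Id.

```python
# Words that must stay upper-case after title-casing (assumed list).
ACRONYMS = {"PAN", "ID"}

# Known spelling variants mapped to their canonical name (lower-cased keys).
ALIASES = {
    "aadhaar": "Aadhaar Card",
    "aadhar card": "Aadhaar Card",
}

CSV_PREFIX = "Select Documents_"


def canonical_document_name(raw):
    """Normalise a raw document name or CSV column name to its canonical form."""
    name = raw.strip()
    if name.startswith(CSV_PREFIX):
        name = name[len(CSV_PREFIX):]
    name = " ".join(name.split())  # collapse inner and trailing whitespace
    alias = ALIASES.get(name.lower())
    if alias:
        return alias
    words = []
    for word in name.split(" "):
        if word.upper() in ACRONYMS:
            words.append(word.upper())
        else:
            words.append(word[:1].upper() + word[1:].lower())
    return " ".join(words)


# Duplicate rows collapse because both spellings normalise to the same key.
unique = {canonical_document_name(n)
          for n in ["Disability certificate", "Disability Certificate"]}
```

Upserting by the returned name is what removes the duplicate Bhamashah Card and Email ID rows without any special-casing.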
Migration plan (executed in a single seeder pass):
- Run a unification importer that:
  - Upserts the 12 Civic Document Type rows into Document Master using canonical names (e.g. Aadhaar → Aadhaar Card).
  - Normalises the 32 existing Document Master rows to canonical names.
  - Reads the 165 scheme-master document column names, normalises to canonical, upserts into Document Master.
  - Final expected size: ~184 unique documents (32 from the current Document Master, the 12 from Civic Document Type, all of which are already in Document Master, and 152 net-new from the scheme CSV).
- Update the Member form schema (b_prof_doc) in kMemberFlow to use canonical names (e.g. Aadhaar → Aadhaar Card). The hardcoded 12-item list stays; the names just align with Document Master rows so a member’s selection maps cleanly to scheme requirements and downstream Document Applications.
- Wire the Scheme Master doctype’s required_documents child table as a Link → Document Master. The scheme master CSV importer populates this from the per-row Select Documents_* columns, mapping each raw column name to its canonical Document Master row.
- Migrate existing Member rows that stored the old name (Aadhaar → Aadhaar Card): a one-time data fix in the Frappe DB and the mobile SQLite cache.
- Deprecate Civic Document Type (drop the doctype after the migration; remove from the Mobile Configuration sync list).
The Member form’s picker still shows the same 12 items — surveyor experience doesn’t change. But the canonical name now flows cleanly through: a member who marked they have Aadhaar Card, a scheme that requires Aadhaar Card, and a Document Application for Aadhaar Card all join on the same string.
6. Gap analysis vs current Scheme Master doctype
Current mform_swasti Scheme Master doctype has scalar fields (Select for one value each):
- state (Link to State): single state only
- gender (Select): single gender only
- caste (Select): single caste only
- marital_status (Select): single marital status only
- occupation (Select): single occupation only
- age_rule (Data): free-text regex
- income_rule (Data): free-text regex
- select_documents (Small Text): comma-separated list
| Gap | Type | Fix |
|---|---|---|
| state is single Link, must be multi-value | MISSING | Convert to Table MultiSelect (child doctype with state Link → State). |
| gender / caste / marital_status / occupation each need to allow multi-value | MISSING | Same: Table MultiSelect children. |
| age_min, age_max, income_max derived columns | MISSING | Add as Int; populated by importer from the regex. Original regex kept in age_rule / income_rule for fidelity. |
| required_documents as Table MultiSelect of Scheme Required Document | MISSING | Currently select_documents is comma-separated Small Text. Should be a child doctype so the Document Application flow can reuse the canonical list. |
| Raw age_rule and income_rule (free text) | KEEP | Source-of-truth regex from PM. Derived min/max are the primary lookup keys. |
| 0 schemes tagged to Maharashtra | NOT A DEFECT | This is V2 prod truth. V3 inherits. |
| Kerala has 2 schemes only | NOT A DEFECT | V2 prod truth. |
7. Implementation plan
- Schema migration in mform_swasti:
  - Create child doctypes: Scheme Eligibility State, Scheme Eligibility Gender, Scheme Eligibility Caste, Scheme Eligibility Marital, Scheme Eligibility Occupation, Scheme Required Document.
  - Migrate Scheme Master fields to Table MultiSelect.
  - Add derived: age_min, age_max, income_max.
- Document Type seeder (seeders/document_types_canonical.py):
  - Normalise the 165 raw columns → 162 canonical Title-Case names.
  - Upsert into a new Scheme Required Document doctype.
  - Link the 7 Member-form-aligned names to the existing Civic Document Type rows.
- Scheme Master CSV importer (seeders/scheme_master_csv.py, mirror the donor importer):
  - Read all 186 rows.
  - For each one-hot family column with a non-empty cell, append a child row.
  - Parse the age/income regexes to derive numeric bounds.
  - Keep legacy_mongo_id = Transaction Id for idempotent re-runs.
- Frappe applicable-schemes API (mform_swasti.api.applicable_schemes(member_name)):
  - Reads the member’s state / gender / caste / marital / occupation / age / income.
  - Joins against the child doctypes, applies regex bounds.
  - Returns name, description, link, required_documents.
- Mobile: replace the placeholder Scheme Application picker with a search-over-applicable list, calling the API on screen open. Cache locally so the surveyor can work offline after first open.
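The lookup behind the applicable-schemes API could combine the derived bounds with the REGEXP fallback roughly as follows. This is a reduced sketch against in-memory SQLite, not the Frappe query: the `scheme` and `scheme_state` tables, the sample rows, and the helper names are hypothetical, and only the state family and the age rule are modelled.

```python
import re
import sqlite3


def _regexp(pattern, value):
    # SQLite evaluates `X REGEXP Y` as regexp(Y, X): pattern first, value second.
    return re.search(pattern, str(value)) is not None


QUERY = """
SELECT s.name
FROM scheme s
WHERE EXISTS (SELECT 1 FROM scheme_state e
              WHERE e.parent = s.name AND e.state = :state)
  AND (
        (s.age_min IS NOT NULL AND :age BETWEEN s.age_min AND s.age_max)
     OR (s.age_min IS NULL
         AND (s.age_rule IN ('', '.*') OR :age_text REGEXP s.age_rule))
      )
ORDER BY s.name
"""


def make_demo_db():
    conn = sqlite3.connect(":memory:")
    conn.create_function("REGEXP", 2, _regexp)
    conn.executescript("""
        CREATE TABLE scheme (name TEXT PRIMARY KEY, age_rule TEXT,
                             age_min INTEGER, age_max INTEGER);
        CREATE TABLE scheme_state (parent TEXT, state TEXT);
    """)
    conn.executemany("INSERT INTO scheme VALUES (?, ?, ?, ?)", [
        ("S-ADULT", "^(1[8-9]|[2-6][0-9]|70)$", 18, 70),  # bounds derived
        ("S-ANY", ".*", None, None),                       # no age filter
        ("S-ODD", "^(21|65)$", None, None),                # derivation failed
        ("S-OTHER-STATE", ".*", None, None),
    ])
    conn.executemany("INSERT INTO scheme_state VALUES (?, ?)", [
        ("S-ADULT", "Rajasthan"), ("S-ANY", "Rajasthan"),
        ("S-ODD", "Rajasthan"), ("S-OTHER-STATE", "Karnataka"),
    ])
    return conn


def applicable(conn, state, age):
    params = {"state": state, "age": age, "age_text": str(age)}
    return [row[0] for row in conn.execute(QUERY, params)]
```

Schemes with derived bounds never touch the regex, so only the few rows where derivation failed pay the REGEXP cost.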
8. Open items
- Empty-family semantics: asked Fahim 2026-05-20 in mForm V3 <> Swasti. Current reading: a scheme with 0 values filled in a family is excluded for everyone. Confirm before locking the seeder logic.
- Document canonicalisation: decided on our side. Standardise to Member-form Title Case; no V2-side cleanup needed. Fahim’s confirmation is still pending but is informational only.
- Age / income regex format for future iterations: would Fahim prefer to send age_min, age_max, income_max as numeric columns next time? Quality-of-life ask; not blocking.
This is the source of truth. Future Scheme Application work must reference this note. Update when Fahim resolves the open items above.