DATA AUDIT — ML
Overall integrity grade: D
Prices are usable; news/social are essentially worthless due to massive ticker-symbol collision; two of four SET pages are 404s; company name in config is unverifiable from package and looks suspect.
Confidence the data is REAL (not hallucinated): Medium
Price file looks like genuine yfinance output. The risk is not fabricated data but contaminated/mis-attributed data and an unverified company-name claim in the config.
Findings by source
Prices — CAVEAT
- 60-session OHLC ordering checks pass (low ≤ open/close ≤ high) on every row I sampled. No negative/zero prices.
- Dates are monotonic, business-day cadence, last session 2026-06-22 vs build timestamp 2026-06-23 — fresh.
- Several zero-volume days still show prices: 2026-05-01, 2026-05-04, 2026-05-26, 2026-06-01, 2026-06-03. These align with Thai public holidays (Labour Day, Coronation Day, Visakha Bucha, etc.), where yfinance is back-filling the prior close with vol=0. Plausible, but downstream code should treat these as non-trading days, not as real prints.
- Massive price action that needs flagging, not removing: 2026-05-22 close 0.43 → 2026-06-22 close 1.51 (≈3.5× in ~1 month), with volume blowing out from ~100k–500k/day to 35.8M on 2026-06-05 and 31.1M on 2026-06-12. This is real-looking data of a low-priced Thai small-cap going parabolic. It is not corrupt, but downstream personas must not mistake the recent run for "fundamentals."
- 10y summary row is internally INCONSISTENT: annualised return -10.22%, vol 10.09%, but Sharpe +0.21. A negative annualised return cannot produce a positive Sharpe at any sane risk-free rate. Either the return field, the Sharpe field, or both are mislabelled in the pipeline. Max drawdown -86.25% is plausible given the chart shape but cannot be re-derived from the 60-row excerpt. Do not quote the summary-row Sharpe.
- No split/dividend adjustment metadata in the package. Adj Close == Close on every visible row, so any historical comparison assumes no adjustments were needed (unverified).
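The row-level checks described in this section can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: the `Bar` dataclass, the `audit_bars` helper, and the sample rows are all hypothetical, and the rule "zero volume means non-trading day" is the assumption the audit recommends rather than something yfinance guarantees.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List


@dataclass(frozen=True)
class Bar:
    """One daily OHLCV row (field names are illustrative)."""
    day: date
    open: float
    high: float
    low: float
    close: float
    volume: int


def ohlc_ok(bar: Bar) -> bool:
    """True when every price is positive and low <= open/close <= high."""
    if min(bar.open, bar.high, bar.low, bar.close) <= 0:
        return False
    return bar.low <= min(bar.open, bar.close) and max(bar.open, bar.close) <= bar.high


def audit_bars(bars: List[Bar]) -> Dict[str, List[date]]:
    """Collect dates that fail the ordering check or carry zero volume."""
    report: Dict[str, List[date]] = {"bad_ohlc": [], "zero_volume": []}
    for bar in bars:
        if not ohlc_ok(bar):
            report["bad_ohlc"].append(bar.day)
        if bar.volume == 0:
            # Assumption: zero volume marks a back-filled non-trading day.
            report["zero_volume"].append(bar.day)
    return report


# Illustrative rows only; they are not copied from the price file.
SAMPLE = [
    Bar(date(2026, 5, 22), 0.42, 0.44, 0.41, 0.43, 250_000),
    Bar(date(2026, 5, 25), 0.43, 0.45, 0.42, 0.44, 310_000),
    Bar(date(2026, 5, 26), 0.44, 0.44, 0.44, 0.44, 0),        # holiday-style row
    Bar(date(2026, 5, 27), 0.44, 0.43, 0.45, 0.46, 120_000),  # high below low: broken
]

report = audit_bars(SAMPLE)
print(report)
```

Rows listed under `zero_volume` would be excluded from return and volatility calculations rather than deleted, so the original file stays auditable.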
News — FAIL
- yahoo: 0 items.
- google_company: 0 items.
- google_ticker: 38 items, but zero are about Microlistens / Mida / ML the SET stock. Sample titles: "OR Sets 2030 Sustainability Goals", "Kitipong wins another term as SET chairman", "SET Thailand builds AI info system", "On-Set Beauty Secrets From 'The White Lotus'", "Drone boom takes off in Thailand", "BBC One's serial killer drama filmed in Thailand". These match on the word "SET" (Stock Exchange of Thailand / TV set / film set), not on ticker ML. The news feed produced no signal about this company whatsoever.
- SET news page is a Nuxt SSR shell: <title>ML - News - The Stock Exchange of Thailand</title> is genuine, but the actual news list is hydrated client-side and not present in the excerpt.
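One way to keep this kind of collision out of the feed is to require a company alias in each title instead of matching on "SET" or the bare ticker. The sketch below is an assumption-laden illustration: `relevant_titles` is a hypothetical helper, and the alias list reuses the names mentioned in this audit, which are themselves unverified.

```python
import re
from typing import Iterable, List

# Hypothetical alias list. The bare ticker "ML" and the word "SET" are left out
# on purpose because they are the tokens that caused the collisions.
ALIASES = ["Microlistens", "Mida"]


def relevant_titles(titles: Iterable[str], aliases: Iterable[str]) -> List[str]:
    """Keep titles that contain at least one alias as a whole word."""
    alias_list = [alias for alias in aliases if alias]
    if not alias_list:
        raise ValueError("at least one non-empty alias is required")
    pattern = "|".join(re.escape(alias) for alias in alias_list)
    matcher = re.compile(rf"\b(?:{pattern})\b", re.IGNORECASE)
    return [title for title in titles if matcher.search(title)]


# Sample titles quoted in the audit.
SAMPLE_TITLES = [
    "OR Sets 2030 Sustainability Goals",
    "Kitipong wins another term as SET chairman",
    "SET Thailand builds AI info system",
    "On-Set Beauty Secrets From 'The White Lotus'",
    "Drone boom takes off in Thailand",
    "BBC One's serial killer drama filmed in Thailand",
]

kept = relevant_titles(SAMPLE_TITLES, ALIASES)
print(len(kept))  # 0: none of the sampled titles mention the company
```

A filter like this trades recall for precision: it would also drop a genuine article that refers to the company only by ticker, so an empty result should be reported as a coverage gap rather than as "no news".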
Social — FAIL
- google_blogs (30 items): all about "ML" = machine learning ("Train and Test Split", "Five Methods for Data Splitting", "Machine Learning Yearning"). Zero relevance.
- google_forums (30 items): "ML" = Mobile Legends, machine learning, male lead in K-drama ("Who's the ending ML in this series? [Cassmire: The Loyal Sword]"). Zero relevance.
- reddit_company and reddit_ticker: both HTTP 403, no data.
- Stocktwits: every single post is about $ML = MoneyLion (US Nasdaq), e.g. "MoneyLion Holdings has agreed to settle… $12.75 million $ML Settlement", "Moneylion $ML before it ran 600%". Wrong company, wrong country, wrong exchange. Downstream personas must not read these as Thai sentiment.
- X/Twitter: explicit coverage gap noted in package (paywalled since 2023). Thai-language venues (Pantip, Blockdit, Thairath) not scraped.
SET pages — FAIL
- profile (406KB) and financial (428KB): titles resolve correctly (ML - Company profile / Company highlight - The Stock Exchange of Thailand), but the excerpts shown are pure Nuxt SSR boilerplate (head/meta/script/preload). The actual profile fields, business description, shareholder table, and financial highlight numbers are rendered client-side and not present in the bytes provided. I cannot verify business line, sector classification, shareholders, paid-up capital, or any financial metric from this package.
- shareholders (392KB): 404 page: <title>ขออภัย ไม่พบข้อมูลที่คุณต้องการ</title> ("Sorry, data not found"), og:url = /en/error/404. Page does not exist or URL is wrong.
- filings (392KB): 404 page, with the same Thai error string and /en/error/404 URL.
- news (414KB): SSR shell, no extractable content in excerpt.
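The two failure modes seen in these pages (a 404 page served as normal HTML, and an SSR shell with no rendered content) could be detected before the files reach any downstream consumer. The sketch below uses only the standard library. `classify_page`, the sample HTML strings, and the 200-character threshold are assumptions made for illustration; only the `/error/404` marker in `og:url` comes from the pages described in this audit.

```python
from html.parser import HTMLParser


class _PageProbe(HTMLParser):
    """Collects the title, og:url, and a count of visible text characters."""

    def __init__(self) -> None:
        super().__init__()
        self.title = ""
        self.og_url = ""
        self.visible_chars = 0
        self._in_title = False
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "meta" and attr.get("property") == "og:url":
            self.og_url = attr.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif not self._skip_depth:
            self.visible_chars += len(data.strip())


def classify_page(html: str, min_visible_chars: int = 200) -> str:
    """Return 'not_found', 'js_shell', or 'content' (threshold is a guess)."""
    probe = _PageProbe()
    probe.feed(html)
    if "/error/404" in probe.og_url:
        return "not_found"
    if probe.visible_chars < min_visible_chars:
        return "js_shell"
    return "content"


# Tiny hand-written stand-ins for the real pages.
NOT_FOUND = ('<html><head><title>ขออภัย ไม่พบข้อมูลที่คุณต้องการ</title>'
             '<meta property="og:url" content="/en/error/404"></head>'
             '<body><div id="__nuxt"></div></body></html>')
SHELL = ('<html><head><title>ML - Company profile</title></head>'
         '<body><div id="__nuxt"></div><script>window.__NUXT__={}</script></body></html>')
RENDERED = '<html><body><p>' + 'Business description. ' * 20 + '</p></body></html>'

for name, page in (("404", NOT_FOUND), ("shell", SHELL), ("rendered", RENDERED)):
    print(name, classify_page(page))
```

File size is not a usable signal here, since the 404 pages in this package are about as large as the shells; the check has to look at the markup itself.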
Contradictions found
- Config "Microlistens (Thailand) PCL" vs business "Hire-purchase financing for trucks and commercial vehicles, THANI competitor": the name "Microlistens" does not naturally describe a commercial-vehicle auto-finance company; it sounds like a tech/audio brand. SET profile excerpt is an SSR shell so the actual registered company name is not verifiable from the bytes provided. Possible config error — flag, do not use the name as authoritative.
- Stocktwits $ML (MoneyLion, US) vs SET ML (Thai hire-purchase): completely different companies. Any sentiment read off Stocktwits would be mis-attributed.
- 10y summary: negative annualised return paired with positive Sharpe (see Prices section). Pipeline arithmetic or column-label bug.
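The summary-row conflict can be checked mechanically. The sketch below assumes the pipeline defines Sharpe as (annualised return minus risk-free rate) divided by annualised volatility, computed from the same return series; that definition is an assumption, not something the package confirms. A pipeline that derives Sharpe from the arithmetic mean of periodic returns while reporting a geometric annualised return could legitimately differ, so a hit here is a flag to investigate rather than proof of a bug. Both function names are hypothetical.

```python
def implied_sharpe(ann_return: float, ann_vol: float, risk_free: float = 0.0) -> float:
    """Sharpe implied by annualised return and volatility (assumed definition)."""
    if ann_vol <= 0:
        raise ValueError("annualised volatility must be positive")
    return (ann_return - risk_free) / ann_vol


def sharpe_sign_conflict(ann_return: float, reported_sharpe: float,
                         risk_free: float = 0.0) -> bool:
    """True when excess return and reported Sharpe have opposite signs."""
    return (ann_return - risk_free) * reported_sharpe < 0


# Figures from the 10y summary row; risk-free rate set to 0 for illustration.
ann_return, ann_vol, reported = -0.1022, 0.1009, 0.21

print(round(implied_sharpe(ann_return, ann_vol), 2))  # -1.01
print(sharpe_sign_conflict(ann_return, reported))     # True
```

Any positive risk-free rate only pushes the implied figure further below zero, so the sign conflict does not depend on the rate chosen.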
Missing / stale data
- No income statement, balance sheet, cash-flow numbers (financial.html is a JS shell).
- No shareholder list (404).
- No filings list / 56-1 One Report / opportunity day deck (404).
- No dividend history.
- No insider-transaction list.
- No analyst coverage / consensus.
- No short-sale / NVDR / foreign-holding data.
- No Thai-language news or forum coverage (Pantip / Blockdit / Settrade webboard).
- No company-specific English news in last 12 months.
- No corporate actions log (so the recent ≈3.5× spike has no explanatory event in the package).
- No sector peer data (THANI numbers asserted as competitor but not present).
DO-NOT-FABRICATE list for downstream personas
Personas must NOT invent or quote any of the following; they are absent from the package:
- Revenue, net income, EPS, book value, NIM, NPL ratio, loan book size, ROE, ROA.
- Any Beneish M-score, Altman Z-score, Piotroski F-score, Sloan accruals (no financial statements are present).
- P/E, P/B, dividend yield, market cap, free float, share count.
- Major shareholder names or % stakes (shareholders page is 404).
- Recent SET filings, board changes, capital actions, MD&A commentary (filings page is 404).
- Reason for the May–June 2026 price spike (no news in package explains it).
- The company's actual registered name and business description beyond what the config asserts (SET profile content is not in the excerpt).
- Any Stocktwits or X sentiment attributed to this stock (Stocktwits data is for US MoneyLion, not SET:ML).
- The summary-row Sharpe of 0.21 (it conflicts with the negative annualised return; treat as bug).
One-line instruction to the CIO
Trust only the daily OHLCV (note the parabolic late-May/June move and the holiday-driven zero-volume rows); treat every other source in this package — news, blogs, forums, Stocktwits, and all four SET HTML pages — as either ticker-collision noise, 404s, or unrendered JS shells, and do not allow any fundamental, shareholder, or sentiment claim about ML to be made from this data.