A visual slice of the first few hundred domains our crawler processed. Each card shows the trust score, color tint, technical signals, and how Kommento would treat that host.
At the code level, Kommento exposes a small Safety API for crawling and scoring the web. It returns structured records for each URL and host.
// High-level operations (pseudocode)
// Crawl & build the Safety Index
crawlWeb(seedList, options) {
// fetch URLs, parse HTML, compute technical signals
// write SafetyRecord rows into the Safety Index
}
// Get a full safety record for a URL
getSafetyRecord(url) => SafetyRecord {
// returns trustScore, tint, signals, categories, reasonCode, timestamps
}
// Check whether a URL is blocked for a given profile
isBlocked(url, profile) => boolean {
// profile: "adult_default" | "teen" | "kid"
// uses SafetyRecord + HostProfile + policy
}
// Get an aggregated host profile (what we store/license)
getHostProfile(domain) => HostProfile {
// returns trust band, categories, defaultDecision, profileDecisions
}
This is the narrow “front door” for engineers: crawl, get a score, ask if something is blocked for a profile, or pull the aggregated host profile.
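To make the profile check concrete, here is a minimal TypeScript sketch of how isBlocked could resolve a decision from a stored host profile. It is an illustration under stated assumptions, not Kommento's implementation: the in-memory hostProfiles lookup, the hostnameOf helper, the decisionFor helper, and the rule that only an explicit BLOCK counts as blocked are all hypothetical. Only the field names (defaultDecision, profileDecisions) and the profile names come from this page.

```typescript
// Hypothetical types: field names follow the HostProfile example on this page;
// everything else is an assumption for illustration.
type Profile = "adult_default" | "teen" | "kid";
type Decision = "ALLOW" | "WARN" | "BLOCK";

interface HostProfileLite {
  domain: string;
  defaultDecision: Decision;
  profileDecisions: { [P in Profile]?: Decision };
}

// Stand-in for the Safety Index: a plain in-memory lookup keyed by domain.
const hostProfiles: { [domain: string]: HostProfileLite } = {
  "example.com": {
    domain: "example.com",
    defaultDecision: "ALLOW",
    profileDecisions: { adult_default: "ALLOW", teen: "ALLOW", kid: "ALLOW" },
  },
};

// Simplified hostname extraction (no IPv6 literal support).
function hostnameOf(url: string): string {
  const match = /^[a-z][a-z0-9+.-]*:\/\/(?:[^@\/?#]*@)?([^\/:?#]+)/i.exec(url);
  if (!match) throw new Error("not an absolute URL: " + url);
  return match[1].toLowerCase();
}

// Returns the per-profile decision, falling back to the host default.
// Unknown hosts yield undefined so the caller can pick its own fallback.
function decisionFor(url: string, profile: Profile): Decision | undefined {
  const domain = hostnameOf(url);
  if (!Object.prototype.hasOwnProperty.call(hostProfiles, domain)) return undefined;
  const host = hostProfiles[domain];
  const specific = host.profileDecisions[profile];
  return specific !== undefined ? specific : host.defaultDecision;
}

// Assumption: only an explicit BLOCK counts as blocked; WARN and unknown hosts pass.
function isBlocked(url: string, profile: Profile): boolean {
  return decisionFor(url, profile) === "BLOCK";
}
```

Keeping the decision lookup separate from the boolean check lets a caller distinguish WARN from ALLOW when it wants to show a softer hint instead of a hard block.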
{
"url": "https://example.com/",
"urlHash": "abcd1234…",
"domain": "example.com",
"trustScore": 93,
"tint": "green",
"riskTier": "low",
"categoryMask": 0, // bitfield: PORN, SCAM, etc.
"signalMask": 8,
"signals": ["MANY_THIRD_PARTY_SCRIPTS"],
"trackerCount": 3,
"thirdPartyScriptCount": 7,
"isHttps": true,
"hasMixedContent": false,
"hasInsecureForm": false,
"aiScore": 0,
"modelVersion": 3,
"reasonCode": null,
"lastCheckedAt": "2025-12-05T17:40:28.832Z"
}
Each crawl produces a raw SafetyRecord: one row per URL with a trustScore, tint color, technical signals, categories, and an optional reason code.
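The signalMask and categoryMask fields are bitfields, so a short sketch of decoding them may help. The bit layout below is hypothetical: the only pairing consistent with the sample record is 8 for MANY_THIRD_PARTY_SCRIPTS, and every other bit name and position is a placeholder chosen for illustration. The helper names decodeMask and encodeMask are also assumptions.

```typescript
interface BitName {
  bit: number;
  name: string;
}

// Hypothetical bit layout. Only "8 -> MANY_THIRD_PARTY_SCRIPTS" matches the
// sample record; the other three entries are placeholders.
const SIGNAL_BITS: BitName[] = [
  { bit: 1, name: "MIXED_CONTENT" },
  { bit: 2, name: "INSECURE_FORM" },
  { bit: 4, name: "MANY_TRACKERS" },
  { bit: 8, name: "MANY_THIRD_PARTY_SCRIPTS" },
];

// Expand a bitfield into the names of every bit that is set.
function decodeMask(mask: number, table: BitName[]): string[] {
  return table
    .filter((entry) => (mask & entry.bit) !== 0)
    .map((entry) => entry.name);
}

// Pack a list of names back into a bitfield; unknown names are ignored.
function encodeMask(names: string[], table: BitName[]): number {
  return table.reduce(
    (mask, entry) => (names.indexOf(entry.name) !== -1 ? mask | entry.bit : mask),
    0
  );
}
```

Under this assumed layout, decodeMask(8, SIGNAL_BITS) yields the same single-entry signals array as the sample record, which is why a record can carry both the compact mask and the readable list.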
From these raw per-URL records, Kommento builds stable HostProfiles: per-domain trust bands, categories, and ALLOW / WARN / BLOCK decisions for each profile. This is the dataset we can store, analyze, and license.
{
"domain": "example.com",
"rootDomain": "example.com",
"trustBand": "GREEN_HIGH", // GREEN_HIGH, GREEN, LIGHT_GREEN, YELLOW, ORANGE, RED_LOW
"riskTier": "low", // low, medium, elevated, high, blocked
"contentCategories": [], // e.g. ["PORN"]
"securityCategories": [], // e.g. ["MALWARE","PHISHING"]
"defaultDecision": "ALLOW",
"profileDecisions": {
"adult_default": "ALLOW",
"teen": "ALLOW",
"kid": "ALLOW"
},
"primaryReason": "TECHNICAL_CLEAN",
"secondarySignals": ["HTTPS_OK"],
"firstSeenAt": "2025-12-05T17:40:28.832Z",
"lastReviewedAt": "2025-12-05T17:40:28.832Z",
"sourceFeeds": ["kommento_crawler"]
}
This aggregated HostProfile is what downstream products consume: a compressed, explainable verdict per domain that can power extensions, parental controls, ISPs, and security tools.
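As a rough illustration of the per-URL to per-domain step, the sketch below groups URL scores by domain and derives a trust band. It is not the actual aggregation logic, which this page does not describe. The use of a median, the helper names, and the intermediate cut-offs between 60 and 90 are all assumptions; the page only states that the top band is 90–100 and that red hosts score under 60.

```typescript
type TrustBand = "GREEN_HIGH" | "GREEN" | "LIGHT_GREEN" | "YELLOW" | "ORANGE" | "RED_LOW";

interface UrlScore {
  domain: string;
  trustScore: number;
  lastCheckedAt: string; // ISO 8601 UTC, as in the SafetyRecord example
}

interface HostSummary {
  domain: string;
  trustScore: number;
  trustBand: TrustBand;
  firstSeenAt: string;
  lastReviewedAt: string;
  urlCount: number;
}

// Assumed cut-offs: only ">= 90" and "< 60" are grounded in this page.
function bandFor(score: number): TrustBand {
  if (score >= 90) return "GREEN_HIGH";
  if (score >= 80) return "GREEN";
  if (score >= 70) return "LIGHT_GREEN";
  if (score >= 65) return "YELLOW";
  if (score >= 60) return "ORANGE";
  return "RED_LOW";
}

// Median is an assumed choice: one unusual page does not swing the host score.
function median(values: number[]): number {
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

function buildHostSummaries(records: UrlScore[]): HostSummary[] {
  const byDomain: { [domain: string]: UrlScore[] } = {};
  const order: string[] = []; // keeps first-seen domain order
  records.forEach((r) => {
    if (!Object.prototype.hasOwnProperty.call(byDomain, r.domain)) {
      byDomain[r.domain] = [];
      order.push(r.domain);
    }
    byDomain[r.domain].push(r);
  });
  return order.map((domain) => {
    const rows = byDomain[domain];
    // Same-format ISO timestamps sort correctly as plain strings.
    const times = rows.map((r) => r.lastCheckedAt).sort();
    const score = median(rows.map((r) => r.trustScore));
    return {
      domain: domain,
      trustScore: score,
      trustBand: bandFor(score),
      firstSeenAt: times[0],
      lastReviewedAt: times[times.length - 1],
      urlCount: rows.length,
    };
  });
}
```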
Big platforms and government sites score in the 90–100 band. These become automatic green “anchor” signals in Kommento’s SERP overlays and toolbars.
Tier-1 global search provider. Clean HTTPS, no mixed content; becomes a “trusted green” anchor in Kommento.
Official U.S. government site. Automatically treated as highly trustworthy and highlighted for users.
Major payment processor. Clean technical posture; foundation for “safe to transact” hints in Kommento.
World Health Organization. High trust score, with minor technical noise from third-party scripts.
These sites are still generally safe, but our scanner sees mixed content or heavy third-party script use. They show up as yellow or light green in SERPs to draw the eye without scaring the user.
Open-source CMS. Good overall posture, but serves some assets over HTTP (mixed content) so we flag it as “slightly noisy”.
Legacy download site. Our crawler sees mixed content plus many third-party scripts, so we downgrade to yellow as a visual “heads-up”.
Form builder platform. Safe, but with enough embedded assets and mixed content that Kommento gently warns the user via a yellow tint.
Ad network. Technically acceptable but noisy from an ads/trackers standpoint. This is where future “privacy-aware” profiles can clamp down harder.
These are the sites Kommento would surface as red: porn, extreme tracking infrastructure, and eventually scams and malware. They all fall under 60 on the 0–100 scale.
Explicit adult site. Detected via our porn host list and meta-content parser. Forced into the red band, ready for teen/kid profiles to auto-block.
Adult dating / escort-style site. Classified as PORN and visually locked into red so it’s impossible to mistake for a safe result.
Tracking / analytics infrastructure. Not “malware”, but strongly identified as a tracker, so we push it into red for privacy-sensitive modes.
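The “forced into the red band” behaviour described above can be sketched as a category override applied on top of the technical score. This is an assumption-heavy illustration: the clamp value of 59, the helper names, and the WARN decision for adult_default are hypothetical. The page only confirms that red hosts score under 60, that PORN forces red, and that teen/kid profiles are expected to auto-block.

```typescript
type Profile = "adult_default" | "teen" | "kid";
type Decision = "ALLOW" | "WARN" | "BLOCK";

// Highest whole-number score that is still under 60 (assumed clamp value).
const RED_CEILING = 59;

// Categories that force a host into red regardless of its technical score.
// Only PORN is confirmed by this page.
const FORCED_RED_CATEGORIES = ["PORN"];

function clampToRed(trustScore: number, categories: string[]): number {
  const forced = categories.some((c) => FORCED_RED_CATEGORIES.indexOf(c) !== -1);
  return forced ? Math.min(trustScore, RED_CEILING) : trustScore;
}

function decisionsFor(trustScore: number, categories: string[]): { [P in Profile]: Decision } {
  const isRed = clampToRed(trustScore, categories) < 60;
  return {
    adult_default: isRed ? "WARN" : "ALLOW", // assumption: adults get a warning, not a block
    teen: isRed ? "BLOCK" : "ALLOW",
    kid: isRed ? "BLOCK" : "ALLOW",
  };
}
```

Clamping the score rather than only changing the tint keeps the numeric score and the colour consistent, so a technically clean adult site can never appear green.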
Error conditions (403, timeouts, certificate problems) are kept separate in a grey “no score” bucket. They don’t pollute trust scores but are still visible for diagnostics.
The crawler could not fetch content (403). We treat this as an unreachable host rather than guess a score.
Connection timed out. Logged as an error so it’s visible in analytics, but excluded from trust scoring.
TLS certificate does not match hostname. This is a strong technical red flag and is surfaced as a separate error row.
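One way to keep errors out of trust scoring is to model each crawl outcome as either a scored result or a grey error row. The sketch below assumes that shape; the FetchResult fields, the error code strings, and the function names are hypothetical, and only the three error conditions (403, timeout, certificate hostname mismatch) come from this page.

```typescript
// Hypothetical summary of what the fetcher observed for one host.
interface FetchResult {
  httpStatus?: number;
  timedOut?: boolean;
  certHostnameMismatch?: boolean;
}

type CrawlOutcome =
  | { kind: "scored"; trustScore: number }
  | { kind: "error"; tint: "grey"; errorCode: string };

// Errors short-circuit before scoring, so no score is ever guessed for them.
function classifyFetch(result: FetchResult, computeScore: () => number): CrawlOutcome {
  if (result.certHostnameMismatch) {
    return { kind: "error", tint: "grey", errorCode: "TLS_HOSTNAME_MISMATCH" };
  }
  if (result.timedOut) {
    return { kind: "error", tint: "grey", errorCode: "TIMEOUT" };
  }
  if (result.httpStatus === 403) {
    return { kind: "error", tint: "grey", errorCode: "HTTP_403" };
  }
  return { kind: "scored", trustScore: computeScore() };
}

// Average over scored outcomes only; error rows stay visible but do not count.
function meanTrustScore(outcomes: CrawlOutcome[]): number | null {
  let sum = 0;
  let count = 0;
  outcomes.forEach((o) => {
    if (o.kind === "scored") {
      sum += o.trustScore;
      count += 1;
    }
  });
  return count === 0 ? null : sum / count;
}
```

Because the error rows remain in the outcome list, they can still be counted and reported for diagnostics even though they never contribute a score.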