An open-intelligence platform that fuses natural language understanding, real-time orchestration, and semantic retrieval to reimagine how people interact with information systems — at any scale, with near-zero infrastructure cost.
CCAS is a self-hosted, cost-efficient AI conversation platform — combining semantic search, dynamic knowledge retrieval, and domain-aware NLP into a single cohesive experience.
CCAS moves beyond keyword matching. Every query is semantically analyzed, routed to the most relevant knowledge domain, and answered with contextual precision. The system understands meaning — not just words.
Built as a Viswanext initiative to demonstrate that enterprise-grade conversational AI doesn't require enterprise-grade infrastructure budgets.
A warm-container semantic cache stores vector embeddings of answered queries. Cosine similarity matching at threshold 0.97 serves cached results instantly — eliminating redundant compute on common questions.
AI, Strategy, FinOps, Quantum, Design, Agentic, FDIP, and Human Wisdom — each query is intelligently routed to the right domain using keyword-aware classification before semantic retrieval begins.
Every component was designed for independence — each pillar can evolve without breaking the others.
Embedding-based similarity search using BAAI/bge-small-en-v1.5 via FastEmbed. Cosine distance scoring selects the top-3 most relevant paragraphs from live website content — no static knowledge base required.
Rather than pre-indexing documents, CCAS fetches live HTML from a curated registry of knowledge pages at query time — using parallel threads (up to 15) to minimize latency. Always current, never stale.
A lightweight keyword classifier maps each incoming query to one of eight knowledge categories before retrieval begins. This narrows the URL pool and dramatically improves relevance with zero additional model calls.
Warm Lambda containers carry a rolling cache of up to 50 query-vector pairs. Semantically similar repeat queries are resolved without any web fetching or embedding recomputation — pure speed.
Login handled via a dedicated AWS Lambda + API Gateway endpoint. Credentials verified server-side with hashed storage in S3. Session token stored in localStorage with auto-redirect logic — simple and secure.
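In Python terms, the server-side check can be sketched roughly as follows. The function names, the salting scheme, and the record format are illustrative assumptions — the actual CCAS Lambda reads its hashed credential record from S3, which is omitted here.

```python
import hashlib
import hmac
import secrets

def verify_login(password: str, stored_hash_hex: str, salt_hex: str) -> bool:
    """Check a password against a salted SHA-256 record (the kind of
    record the auth Lambda would read from S3). Illustrative only."""
    digest = hashlib.sha256(bytes.fromhex(salt_hex) + password.encode()).hexdigest()
    # Constant-time comparison avoids leaking information via timing.
    return hmac.compare_digest(digest, stored_hash_hex)

def issue_token() -> str:
    """Opaque session token the frontend stores in localStorage."""
    return secrets.token_urlsafe(32)
```

The token is opaque on purpose: the browser only stores and replays it, and all verification stays server-side.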
Fully serverless on AWS — Lambda, API Gateway, S3, and CloudFront. No servers running when idle. Monthly cost in active use stays within AWS Free Tier limits for moderate traffic. Pay only for what you invoke.
The entire platform runs on five AWS primitives — each handling a distinct concern, each independently scalable, and together delivering a coherent AI experience.
HTML + JS hosted on S3
Global edge delivery + HTTPS
REST endpoints + CORS
RAG + NLP + Router
Real-time knowledge fetch
Stateless request pipeline — each invocation is fully independent; the warm-container cache is a best-effort accelerator, never required state
Eight knowledge domains, real-time retrieval, and a conversational interface that feels as natural as asking a colleague.
Ask about LLMs, embeddings, transformers, RAG, ML pipelines, responsible AI, NLP, and the full breadth of applied machine learning — answered from live curated sources.
Query the FDIP domain for scenario modeling, risk intelligence, ROI analysis, executive dashboards, and enterprise financial decision frameworks.
Explore CTO frameworks, blue ocean strategy, executive presence, critical thinking models, future readiness, and technology advisor playbooks.
Understand budgeting and planning, DuPont analysis, ratio analysis, risk and governance frameworks, and financial fundamentals for cloud operations.
Explore qubits, quantum entanglement, quantum algorithms, hardware platforms, and the future trajectory of quantum computing in enterprise contexts.
Query patterns for Kubernetes, serverless, CI/CD, IaC, monitoring, auto-scaling, multi-region redundancy, containerization, and cloud-native design.
Understand autonomous agents, multi-agent orchestration, tool use, planning loops, human-in-the-loop designs, and the emerging agentic AI landscape.
A unique domain dedicated to wisdom, integrity, compassion, mindful awareness, and the principles of thoughtful leadership — rare in AI systems, intentional here.
"The most powerful AI system isn't the one with the most parameters — it's the one that understands what you're actually asking and responds with knowledge that matters."
— Viswanext Research Initiative, CCAS
CCAS orchestrates a five-step pipeline entirely within a single serverless invocation — no orchestration servers, no queues, no waiting.
Before any retrieval work begins, the incoming query is vectorized and compared against the in-memory cache of previously answered questions. If a semantically equivalent query exists (cosine similarity > 0.97), the cached answer is returned instantly with zero web fetching.
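A minimal sketch of this step, using the thresholds and cache size described here; the helper names are illustrative, not the actual CCAS source:

```python
from collections import deque

import numpy as np

CACHE_SIZE = 50        # rolling window of query-vector pairs
SIM_THRESHOLD = 0.97   # cosine similarity required for a cache hit

# Module-level state survives between invocations in a warm Lambda container.
_cache = deque(maxlen=CACHE_SIZE)  # entries: (unit-norm vector, answer)

def _unit(v):
    v = np.asarray(v, dtype=np.float32)
    return v / (np.linalg.norm(v) + 1e-12)

def cache_lookup(query_vec):
    """Return a cached answer if a semantically equivalent query was seen."""
    q = _unit(query_vec)
    for vec, answer in _cache:
        # Both vectors are unit-norm, so the dot product is cosine similarity.
        if float(np.dot(q, vec)) > SIM_THRESHOLD:
            return answer
    return None

def cache_store(query_vec, answer):
    _cache.append((_unit(query_vec), answer))
```

Because the deque is bounded at 50 entries, the oldest pair is evicted automatically — memory stays flat no matter how long the container lives.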
A lightweight keyword classifier maps the query to one of eight knowledge domains (AI, Strategy, FDIP, FinOps, Quantum, Design, Agentic, Think). This narrows the retrieval URL pool to the most relevant knowledge sources before any fetching occurs.
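The classifier can be as simple as a keyword table scored by substring hits. The keyword lists below are illustrative guesses drawn from the domain descriptions on this page, not the actual CCAS tables:

```python
# Illustrative keyword table; the real CCAS keyword lists are not shown here.
DOMAIN_KEYWORDS = {
    "AI":       ["llm", "embedding", "transformer", "rag", "nlp"],
    "Strategy": ["cto", "blue ocean", "executive"],
    "FDIP":     ["scenario", "roi", "risk intelligence", "dashboard"],
    "FinOps":   ["budget", "dupont", "ratio", "governance"],
    "Quantum":  ["qubit", "entanglement", "quantum"],
    "Design":   ["kubernetes", "serverless", "ci/cd", "iac"],
    "Agentic":  ["agent", "tool use", "planning loop"],
    "Think":    ["wisdom", "integrity", "compassion", "mindful"],
}

def classify(query: str) -> str:
    """Map a query to one domain by naive substring scoring.
    Falls back to 'AI' when no keyword matches."""
    q = query.lower()
    scores = {d: sum(kw in q for kw in kws) for d, kws in DOMAIN_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "AI"
```

This is the "zero additional model calls" property in practice: routing costs a handful of string scans, not an inference round-trip.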
Using Python's ThreadPoolExecutor with up to 15 concurrent workers, CCAS fetches the HTML content of all URLs in the classified domain simultaneously. An HTMLParser extracts clean paragraph text, stripping navigation, scripts, and structural noise.
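A sketch of this fetch-and-extract step with the standard library. The extractor and the pluggable `fetch` parameter are illustrative simplifications of whatever CCAS actually does internally:

```python
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser
from urllib.request import urlopen

class ParagraphExtractor(HTMLParser):
    """Collects text inside <p> tags, skipping <script>/<style>/<nav> noise."""
    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._in_p = False
        self._skip = 0
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style", "nav"):
            self._skip += 1
        elif tag == "p" and not self._skip:
            self._in_p = True
            self._buf = []

    def handle_endtag(self, tag):
        if tag in ("script", "style", "nav"):
            self._skip = max(0, self._skip - 1)
        elif tag == "p" and self._in_p:
            text = "".join(self._buf).strip()
            if text:
                self.paragraphs.append(text)
            self._in_p = False

    def handle_data(self, data):
        if self._in_p and not self._skip:
            self._buf.append(data)

def extract_paragraphs(html: str) -> list[str]:
    parser = ParagraphExtractor()
    parser.feed(html)
    return parser.paragraphs

def fetch_all(urls, fetch=lambda u: urlopen(u, timeout=5).read().decode("utf-8", "replace")):
    # Up to 15 concurrent workers, matching the latency budget described above.
    with ThreadPoolExecutor(max_workers=15) as pool:
        pages = pool.map(fetch, urls)  # map() preserves URL order
        return [para for page in pages for para in extract_paragraphs(page)]
```

Because the fetcher is a parameter, the pipeline can be tested offline with canned HTML while production code uses real HTTP requests.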
Every extracted paragraph is embedded using BAAI/bge-small-en-v1.5. Cosine similarity scores are computed between the query embedding and all paragraph embeddings. The top-3 paragraphs above a relevance threshold of 0.32 are selected as the answer context.
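The scoring-and-selection step reduces to vectorized cosine similarity. In CCAS the vectors come from BAAI/bge-small-en-v1.5 via FastEmbed; the sketch below takes pre-computed vectors as plain arrays so the ranking logic stands alone:

```python
import numpy as np

RELEVANCE_THRESHOLD = 0.32
TOP_K = 3

def top_paragraphs(query_vec, paragraph_vecs, paragraphs):
    """Return up to TOP_K (paragraph, score) pairs whose cosine similarity
    to the query exceeds RELEVANCE_THRESHOLD, best match first."""
    q = np.asarray(query_vec, dtype=np.float32)
    P = np.asarray(paragraph_vecs, dtype=np.float32)
    # Cosine similarity = dot product of unit-normalized vectors.
    q = q / (np.linalg.norm(q) + 1e-12)
    P = P / (np.linalg.norm(P, axis=1, keepdims=True) + 1e-12)
    sims = P @ q
    order = np.argsort(-sims)[:TOP_K]
    return [(paragraphs[i], float(sims[i])) for i in order
            if sims[i] > RELEVANCE_THRESHOLD]
```

The threshold filter matters as much as the ranking: on an off-topic page, even the "best" paragraph scores below 0.32 and is dropped rather than padded into the answer.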
The top matching paragraphs are packaged with their domain label and source URL into a structured JSON response. The query vector and response are stored in the semantic cache for future lookups. The answer reaches the user's browser via CloudFront in under 2 seconds for cold starts, under 100ms for cache hits.
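The shape of that response, as a Lambda-proxy-style sketch — field names here are illustrative, not the exact CCAS contract:

```python
import json

def build_response(domain, matches, cache_hit=False):
    """Package top paragraphs into the JSON the browser receives.
    `matches` is a list of (text, score, source_url) tuples."""
    body = {
        "domain": domain,
        "cached": cache_hit,
        "answers": [
            {"text": text, "score": round(score, 4), "source": url}
            for text, score, url in matches
        ],
    }
    return {
        "statusCode": 200,
        "headers": {
            "Content-Type": "application/json",
            # CORS header so the S3-hosted frontend can call the API Gateway endpoint
            "Access-Control-Allow-Origin": "*",
        },
        "body": json.dumps(body),
    }
```

Carrying the domain label and source URL alongside each paragraph lets the frontend show provenance without a second request.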
CCAS is a Viswanext independent initiative — a live demonstration that modern conversational AI can be built with commodity open-source components, serverless infrastructure, and genuine engineering thoughtfulness.
No venture funding. No proprietary models. No hidden complexity. Just clean architecture, good judgment, and a commitment to making AI understanding accessible.
The platform serves as a working reference for anyone building RAG-based systems, semantic search, or serverless AI backends on AWS.
FastEmbed, NumPy, Python standard library — every dependency is open source and freely available. No API keys to third-party models required.
CCAS demonstrates live-fetch RAG, semantic caching, and domain routing as a complete working system — not a toy demo but a real production pattern.
AWS Lambda free tier covers 1M invocations/month. S3 and CloudFront cost cents at typical usage. The entire platform can run free for personal and small-team workloads.
Adding a new knowledge domain requires updating a WEBSITES dictionary and a single keyword condition. The entire architecture scales horizontally with zero structural change.
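Concretely, the extension looks roughly like this. The dictionary shape and the URLs are placeholders; only the pattern — one dictionary entry plus one keyword condition — comes from the description above:

```python
# Hypothetical shape of the URL registry; keys and URLs are placeholders.
WEBSITES = {
    "AI":      ["https://example.com/ai/llms", "https://example.com/ai/rag"],
    "Quantum": ["https://example.com/quantum/qubits"],
    # ... remaining domains ...
}

def add_domain(name, urls):
    """Register a new knowledge domain. The retrieval pipeline iterates
    over WEBSITES, so it picks the new entry up with no other changes."""
    WEBSITES[name] = list(urls)

# Adding a ninth domain is one call (plus a keyword condition in the router).
add_domain("Robotics", ["https://example.com/robotics/intro"])
```

Horizontal scaling falls out of the data shape: the fetcher, embedder, and ranker never hard-code domain names, so a ninth or nineteenth domain costs the same as the first eight.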
Log in to explore the full conversational intelligence interface, query all knowledge domains, and experience CCAS in action.
Secure authentication via AWS Lambda · Session-based access