horus84/Asymmetric-0T-Sequence-Discovery-

Unsupervised Sequence Discovery via Directed Optimal Transport

This repository contains a preliminary prototype for the Markov-Transport Correlation Coefficient (MTCC).

Abstract

Current representation learning frameworks such as OT-CPCC excel at embedding symmetric hierarchies, but sequential data exhibits strict directional asymmetry: the transition from one symbol to another need not cost the same as the reverse transition. This project proposes an information-geometric metric, the MTCC, that correlates the Wasserstein distances between contextual distributions in a model's latent space with their empirical Markov transition costs.
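
To make the idea concrete, here is a minimal sketch of how such a correlation could be computed. It is not the repository's implementation; it rests on several assumptions that do not come from this README: transition costs are taken as `-log P(j | i)` from add-one-smoothed bigram counts, each symbol's contextual distribution is approximated by a diagonal Gaussian (which gives a closed-form 2-Wasserstein distance), and the two quantities are compared with a Pearson correlation over ordered symbol pairs. The names `transition_costs`, `gaussian_w2`, and `mtcc_sketch` are hypothetical.

```python
import numpy as np


def transition_costs(sequences, vocab_size, alpha=1.0):
    """Directed costs C[i, j] = -log P(j | i) from smoothed bigram counts.

    Assumption: add-alpha smoothing keeps unseen transitions finite.
    """
    counts = np.full((vocab_size, vocab_size), alpha, dtype=float)
    for seq in sequences:
        for a, b in zip(seq[:-1], seq[1:]):
            counts[a, b] += 1.0
    probs = counts / counts.sum(axis=1, keepdims=True)
    return -np.log(probs)


def gaussian_w2(x, y):
    """2-Wasserstein distance between diagonal-Gaussian fits of two samples.

    Assumption: a Gaussian approximation stands in for a full OT solver.
    x and y are (n_samples, dim) arrays of latent vectors.
    """
    mean_gap = np.sum((x.mean(axis=0) - y.mean(axis=0)) ** 2)
    std_gap = np.sum((x.std(axis=0) - y.std(axis=0)) ** 2)
    return float(np.sqrt(mean_gap + std_gap))


def mtcc_sketch(sequences, latents, vocab_size):
    """Pearson correlation between latent distances and transition costs.

    latents[s] is an (n_s, dim) array of hidden states collected at symbol s.
    Every ordered pair (i, j) with i != j contributes one data point, so the
    cost side is asymmetric even though the distance side is symmetric.
    """
    costs = transition_costs(sequences, vocab_size)
    dists, cost_vals = [], []
    for i in range(vocab_size):
        for j in range(vocab_size):
            if i != j:
                dists.append(gaussian_w2(latents[i], latents[j]))
                cost_vals.append(costs[i, j])
    return float(np.corrcoef(dists, cost_vals)[0, 1])
```

Here `sequences` would be lists of integer symbol IDs and `latents` a mapping from each symbol ID to the hidden states an encoder produced at that symbol. How the actual pipeline makes the latent side directed is not stated here, so the symmetric distance above is only a placeholder.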

Key Result

In testing on the undeciphered Indus script, the latent space of a standard LSTM exhibited a directed OT correlation of 0.2211 (p = 8.08e-06). The correlation is highly significant but modest in size, which suggests that while such networks implicitly learn some grammatical flow, explicit OT regularization (e.g., FastFT) may be needed to maximize this alignment.
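
This README does not say how the p-value was obtained. As an illustration only, one common way to assess a correlation between two pairwise matrices is a Mantel-style permutation test, which shuffles symbol labels rather than individual pairs because entries that share a symbol are not independent. The function name `mantel_pvalue` and its parameters are hypothetical.

```python
import numpy as np


def mantel_pvalue(dist, cost, n_perm=2000, seed=0):
    """Permutation p-value for the correlation of two square matrices.

    dist and cost are (n, n) arrays indexed by symbol; diagonals are ignored.
    Assumption: symbols are exchangeable under the null hypothesis.
    """
    rng = np.random.default_rng(seed)
    n = dist.shape[0]
    off_diag = ~np.eye(n, dtype=bool)
    observed = np.corrcoef(dist[off_diag], cost[off_diag])[0, 1]
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(n)
        # Relabel symbols: permute rows and columns of cost together.
        shuffled = cost[np.ix_(perm, perm)]
        r = np.corrcoef(dist[off_diag], shuffled[off_diag])[0, 1]
        if abs(r) >= abs(observed):
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return float(observed), (hits + 1) / (n_perm + 1)
```

The smallest p-value this test can report is `1 / (n_perm + 1)`, so resolving a value near 1e-05 would need on the order of a few hundred thousand permutations or an analytic test instead.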

The repository includes indus_dataset_anonymized.json, a representative sample of the corpus of more than 6,000 inscriptions that I curated for this research. The results reported in my proposal were derived from the full dataset; this sample is provided to demonstrate the data structure and the reproducibility of the MTCC pipeline.
