\[ \require{mathtools} \let\DeclarePairedDelimiter\DeclarePairedDelimiters % MathJax typo %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % symbol %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % sets \newcommand{\N}{\mathbb{N}} % natural numbers \newcommand{\Z}{\mathbb{Z}} % integers \newcommand{\Q}{\mathbb{Q}} % rational numbers \newcommand{\R}{\mathbb{R}} % real numbers \newcommand{\C}{\mathbb{C}} % complex numbers % category \newcommand{\catf}[1]{{\mathbf{#1}}} \newcommand{\cA}{\catf{A}} \newcommand{\cB}{\catf{B}} \newcommand{\cC}{\catf{C}} \newcommand{\cD}{\catf{D}} \newcommand{\cE}{\catf{E}} \newcommand{\cF}{\catf{F}} \newcommand{\cG}{\catf{G}} \newcommand{\cH}{\catf{H}} \newcommand{\cI}{\catf{I}} \newcommand{\cJ}{\catf{J}} \newcommand{\cK}{\catf{K}} \newcommand{\cL}{\catf{L}} \newcommand{\cM}{\catf{M}} \newcommand{\cN}{\catf{N}} \newcommand{\cO}{\catf{O}} \newcommand{\cP}{\catf{P}} \newcommand{\cQ}{\catf{Q}} \newcommand{\cR}{\catf{R}} \newcommand{\cS}{\catf{S}} \newcommand{\cT}{\catf{T}} \newcommand{\cU}{\catf{U}} \newcommand{\cV}{\catf{V}} \newcommand{\cW}{\catf{W}} \newcommand{\cX}{\catf{X}} \newcommand{\cY}{\catf{Y}} \newcommand{\cZ}{\catf{Z}} \newcommand{\cZero}{\catf{0}} \newcommand{\cOne}{\catf{1}} \newcommand{\cTwo}{\catf{2}} \newcommand{\cArr}{\catf{Arr}} % arrow \newcommand{\cPSh}{\catf{PSh}} % presheaf \newcommand{\cCone}{\catf{Cone}} % cone \newcommand{\cCocone}{\catf{Cocone}} % cocone \newcommand{\cFin}{\catf{Fin}} % finite prefix \newcommand{\cSet}{\catf{Set}} % functions \newcommand{\cRel}{\catf{Rel}} % relations \newcommand{\cPos}{\catf{Pos}} % posets \newcommand{\cMon}{\catf{Mon}} % monoids \newcommand{\cCMon}{\catf{CMon}} % commutative monoids \newcommand{\cGrp}{\catf{Grp}} % groups \newcommand{\cAb}{\catf{Ab}} % abelian groups \newcommand{\cCat}{\catf{Cat}} % categories \newcommand{\cGrpd}{\catf{Grpd}} % groupoids \newcommand{\cVect}{\catf{Vect}} % vector spaces \newcommand{\cMat}{\catf{Mat}} % matrices \newcommand{\cTop}{\catf{Top}} % topological spaces and continuous maps \newcommand{\cMet}{\catf{Met}} % metric spaces and metric maps \newcommand{\cMeas}{\catf{Meas}} % measurable spaces and measurable functions \newcommand{\cStoch}{\catf{Stoch}} % measurable spaces and stochastic maps \newcommand{\cProb}{\catf{Prob}} % probability measures and measure-preserving functions %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % supscript %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \newcommand{\op}{^\mathrm{op}} \newcommand{\inv}{^{-1}} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % function %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \DeclareMathOperator{\lcm}{lcm} \DeclareMathOperator{\logsumexp}{log-sum-exp} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % arrow %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % morphism \newcommand{\xto}{\xrightarrow} % -> \newcommand{\xot}{\xleftarrow} % <- \newcommand{\toot}{\rightleftarrows} % <=> \newcommand{\xtoot}[2]{\xrightleftharpoons[#2]{#1}} \newcommand{\iso}{\cong} % ~= \newcommand{\klto}{\rightsquigarrow} % ~> \newcommand{\mono}{\rightarrowtail} % >-> \newcommand{\epi}{\twoheadrightarrow} % ->> \newcommand{\ihom}{\multimap} % -o % category \newcommand{\incl}{\hookrightarrow} \newcommand{\adjto}[2]{\overset{{}\xto[]{#1}{}}{\underset{{}\xot[#2]{}{}}{\bot}}} % functor \newcommand{\nat}{\Rightarrow} % => \newcommand{\xnat}{\xRightarrow} % => \newcommand{\comma}{\downarrow} \newcommand{\adj}{\dashv} % -| %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % delimiter 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \DeclarePairedDelimiter{\parens}{(}{)} % parentheses ( ) \DeclarePairedDelimiter{\bracks}{[}{]} % brackets [ ] \DeclarePairedDelimiter{\braces}{\{}{\}} % braces { } \DeclarePairedDelimiter{\angles}{\langle}{\rangle} % angles \DeclarePairedDelimiter{\floor}{\lfloor}{\rfloor} \DeclarePairedDelimiter{\ceil}{\lceil}{\rceil} \newcommand{\set}{\braces} \newcommand{\singleton}{{\set{*}}} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % operator %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \newcommand{\compL}{\mathbin{\circ}} \newcommand{\compR}{\mathbin{;}} \newcommand{\Kl}{\mathrm{Kl}} \DeclareMathOperator{\Obj}{Obj} \DeclareMathOperator{\Hom}{Hom} \DeclareMathOperator{\Sub}{Sub} \DeclareMathOperator{\Eq}{Eq} \DeclareMathOperator{\Coeq}{Coeq} \DeclareMathOperator{\Pull}{Pull} \DeclareMathOperator{\Push}{Push} \DeclareMathOperator{\Lan}{Lan} \DeclareMathOperator{\Ran}{Ran} \DeclareMathOperator{\Lift}{Lift} \DeclareMathOperator{\Rift}{Rift} \DeclareMathOperator{\id}{id} \DeclareMathOperator{\colim}{colim} % exponential \DeclareMathOperator{\partf}{partial} \DeclareMathOperator{\curry}{curry} % natural number \DeclareMathOperator{\zero}{zero} \DeclareMathOperator{\successor}{succ} % list \DeclareMathOperator{\nil}{nil} \DeclareMathOperator{\cons}{cons} \newcommand{\concat}{\mathbin{{+}\mspace{-8mu}{+}}} % fold \DeclareMathOperator{\fold}{fold} \DeclareMathOperator{\map}{map} \DeclareMathOperator{\filter}{filter} % bool \DeclareMathOperator{\cond}{cond} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % order %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \newcommand{\meet}{\wedge} \newcommand{\join}{\vee} \newcommand{\bigmeet}{\bigwedge} \newcommand{\bigjoin}{\bigvee} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % logic %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \newcommand{\lT}{\top} \newcommand{\lF}{\bot} % logical connective \newcommand{\binop}[1]{\mathbin{#1}} % conjunction AND \newcommand{\lcon}{\binop{\wedge}} % not conjunction NAND \newcommand{\lncon}{\binop{\uparrow}} % disjunction OR \newcommand{\ldis}{\binop{\vee}} % not disjunction NOR \newcommand{\lndis}{\binop{\downarrow}} % implication \newcommand{\limp}{\binop{\rightarrow}} % not implication \newcommand{\lnimp}{\binop{\nrightarrow}} % converse implication \newcommand{\lcim}{\binop{\leftarrow}} % not converse implication \newcommand{\lncim}{\binop{\nleftarrow}} % equivalence XNOR \newcommand{\leqv}{\binop{\leftrightarrow}} % not equivalence XOR \newcommand{\lneqv}{\binop{\nleftrightarrow}} \newcommand{\lproves}{\binop{\vdash}} % |- \newcommand{\lmodels}{\binop{\vDash}} % |= \newcommand{\lnproves}{\binop{\nvdash}} % |/- \newcommand{\lnmodels}{\binop{\nvDash}} % |/= \newcommand{\truth}{{\set{\lT, \lF}}} \newcommand{\quant}{{[0, \infty]}} \newcommand{\inter}{{[0, 1]}} \newcommand{\qnot}{\mathord{\sim}} \newcommand{\qcon}{\binop{\otimes}} \newcommand{\qdis}{\binop{\oplus}} \newcommand{\qimp}{\binop{\multimap}} \DeclareMathOperator*{\qforall}{\triangledown} \DeclareMathOperator*{\qexists}{\vartriangle} \]
\[ \newcommand{\policy}{\pi} \newcommand{\transition}{\mathrm{p}} \newcommand{\reward}{\mathrm{r}} \newcommand{\statistic}{\tau} \newcommand{\svalue}{\mathrm{v}} % state value function \newcommand{\qvalue}{\mathrm{q}} % state-action value function \DeclareMathOperator{\gen}{gen} \DeclareMathOperator{\agg}{agg} \DeclareMathOperator{\fsum}{sum} \DeclareMathOperator{\init}{init} \DeclareMathOperator{\post}{post} \newcommand{\update}{\mathbin{\triangleright}} \]
- 張 一凡 · チョウ イーファン · [ʈʂāŋ īː fǽn]
- Assistant Professor, Department of Complexity Science and Engineering, Graduate School of Frontier Sciences, The University of Tokyo
- Visiting Scientist, Imperfect Information Learning Team, RIKEN Center for Advanced Intelligence Project
- [yivan.xyz@gmail.com] [yivan.zhang@k.u-tokyo.ac.jp]
- [CV] [Google Scholar] [OpenReview] [DBLP] [GitHub]
I’m an assistant professor at The University of Tokyo, where I work on the theory and application of machine learning. I was previously a postdoctoral researcher at the RIKEN Center for Advanced Intelligence Project. I obtained my Ph.D. from The University of Tokyo, where I was fortunate to be advised by Prof. Masashi Sugiyama.
Lately, I’ve been exploring algebra and applied category theory in machine learning. My long-term goal is to understand how the structure of learning systems gives rise to intelligent behavior. I aim to uncover how perception, abstraction, reasoning, and creativity emerge from first principles, and how algebraic tools can help manage their growing complexity. I’m especially interested in how intelligent agents can acquire, represent, and reuse knowledge and skills in structured ways, and how they can behave in alignment with human values, even under minimal supervision.
Over the years, I’ve become fascinated by both foundational formalisms—like formal definitions of disentangled representations or reward aggregation in reinforcement learning—and fundamental challenges, such as compositional generalization and limited supervision. I’m drawn to questions that appear across domains in different forms, and I seek unifying abstractions that reveal their shared structure. These are the kinds of problems that resist quick fixes, but reward careful formulation and deep theoretical insight. I believe that we must ask the right questions before looking for the right answers.
That’s why I value abstractions that clarify rather than obscure complexity. For me, category theory offers such a lens: it reveals deep structure, offers diagrammatic reasoning, highlights compositionality, and helps manage complexity by showing how parts relate to wholes. While much of my thinking is rooted in abstract mathematics, I strive to translate abstract theories into practical intuitions that resonate with the broader machine learning community.
Concretely, my research spans the following topics:
- Foundational machine learning
- algebraic formalisms and techniques
- explainable decision-making under uncertainty
- weak supervision
- Representation learning
- disentanglement
- symmetry and equivariance
- abstraction and structured representation
- neural-symbolic learning and reasoning
- Reinforcement learning
- non-standard problem formulations
- compositional and hierarchical learning
- safety and alignment
- exploration
I love things that are discovered, not invented.
My research has been generously supported by the RIKEN Center for Advanced Intelligence Project (RIKEN AIP), the Japan Society for the Promotion of Science (JSPS), the Japan Science and Technology Agency (JST), and Microsoft Research Asia (MSRA).
If you have a specific research idea you’d like to discuss, feel free to get in touch via email. I’m also happy to meet in person for a coffee if you’re in Tokyo. Prospective students are encouraged to first consult the Graduate School of Frontier Sciences or the Graduate School of Information Science and Technology at The University of Tokyo for further information before reaching out.
News
- August 2025: Co-organizing an IJCAI’25 tutorial, “AI Meets Algebra: Foundations and Frontiers” in Montréal, Canada
- May 2025: “Recursive Reward Aggregation” accepted to RLC’25 in Edmonton, Canada
Featured publications
Recursive Reward Aggregation
- reinforcement learning, Bellman equation, dynamic programming, recursion scheme, recursive coalgebra, algebra fusion, lenses and optics
- Yuting Tang, Yivan Zhang, Johannes Ackermann, Yu-Jie Zhang, Soichiro Nishimori, Masashi Sugiyama
- Reinforcement Learning Conference 2025 (RLC’25)
- [arXiv] [OpenReview] [code] [poster]
Summation is a recursive function; so are many others.
The discounted sum \({\fsum}_\gamma(r_{1:t}) = r_1 + \gamma r_2 + \gamma^2 r_3 + \cdots + \gamma^{t-1} r_t = r_1 + \gamma \; (r_2 + \gamma \; (\cdots + \gamma \; (r_t + 0))) = r_1 + \gamma \; {\fsum}_\gamma(r_{2:t})\) is a recursive function. So are many others—length, min, max, or sum of squares. From these, we can compute common statistics like the range, mean, or variance. For example, to get the variance, we can keep track of the length, sum, and sum of squares, and then use a final formula to combine them.
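To make the pattern concrete, here is a minimal Python sketch (mine, not from the paper) that writes the discounted sum and the variance statistics as the same right fold, varying only the update function and the initial value.

```python
from functools import reduce

# A minimal sketch (mine, not the paper's code): the discounted sum and the
# variance statistics written as the same right fold, differing only in the
# update function and the initial value.

def fold(update, init, rewards):
    """fold(u, i, [r1, ..., rt]) = u(r1, u(r2, ... u(rt, i) ...))."""
    return reduce(lambda acc, r: update(r, acc), reversed(rewards), init)

def discounted_sum(rewards, gamma):
    return fold(lambda r, acc: r + gamma * acc, 0.0, rewards)

def variance(rewards):
    # Track (length, sum, sum of squares), then combine with a final formula.
    n, s, q = fold(lambda r, acc: (acc[0] + 1, acc[1] + r, acc[2] + r * r),
                   (0, 0.0, 0.0), rewards)
    return q / n - (s / n) ** 2

rewards = [1.0, 2.0, 3.0, 4.0]
print(discounted_sum(rewards, 0.9))  # 1 + 0.9*2 + 0.81*3 + 0.729*4 = 8.146
print(variance(rewards))             # 30/4 - (10/4)**2 = 1.25
```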
In reinforcement learning, rewards are generated step by step over time as an agent takes actions. If we also aggregate these rewards step by step, we can keep track of the reward statistics as we go. Using ideas from algebra, we can describe the recursive generation of rewards as a hylomorphism (the unique homomorphism from a recursive coalgebra, building a list from a seed value) and the recursive aggregation of rewards as a catamorphism (the unique homomorphism from the initial algebra, reducing a list to a single value). A technique called algebra fusion lets us merge these two steps into a single recursive statistic function, which gives us a generalized Bellman equation. This algebraic perspective allows us to extend many value-based and actor-critic methods to a broad class of recursive reward aggregations.
Concretely, let \(S\), \(A\), \(R\), \(T\) be the sets of states, actions, rewards, and statistics. For the variance, the statistics \(T = \N \times \R \times \R_{\geq 0}\) are the length \(n\), sum \(s\), and sum of squares \(q\) of a sequence of rewards. For a policy \(\policy: S \to A\), a transition function \(\transition: S \times A \to S\), and a reward function \(\reward: S \times A \to R\), the statistic function \(\statistic_\policy: S \to T\) can be computed recursively using
- an update function \(\update: R \times T \to T: (r, [n, s, q]) \mapsto [n + 1, s + r, q + r^2]\) for non-terminal states, and
- an initial value \(\init = [0, 0, 0] \in T\) for terminal states.
Then, the value function \(\svalue_\policy: S \to R\) can be obtained by applying a post-processing function \(\post: T \to R: [n, s, q] \mapsto \frac{q}{n} - \left(\frac{s}{n}\right)^2\) after the statistic function \(\statistic_\policy: S \to T\).
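Putting these pieces together, the statistic function satisfies a generalized Bellman equation: \(\statistic_\policy(s) = \init\) if \(s\) is terminal, and \(\statistic_\policy(s) = \reward(s, \policy(s)) \update \statistic_\policy(\transition(s, \policy(s)))\) otherwise. Below is a minimal sketch of this recursion on a toy deterministic chain; the environment and all names are hypothetical, not from the paper.

```python
# A minimal sketch of the generalized Bellman recursion for the variance:
# tau(s) = init if s is terminal, else update(r(s, pi(s)), tau(p(s, pi(s)))).
# The toy deterministic chain below is hypothetical, not from the paper.

def update(r, t):                      # update: R x T -> T
    n, s, q = t
    return (n + 1, s + r, q + r * r)

INIT = (0, 0.0, 0.0)                   # init in T, for terminal states

def post(t):                           # post: T -> R, variance from (n, s, q)
    n, s, q = t
    return q / n - (s / n) ** 2

def statistic(s, policy, transition, reward, terminal):
    if terminal(s):
        return INIT
    a = policy(s)
    return update(reward(s, a),
                  statistic(transition(s, a), policy, transition, reward, terminal))

# Chain: state k emits reward k + 1 and moves to k + 1; state 4 is terminal.
tau = statistic(0,
                policy=lambda s: 0,
                transition=lambda s, a: s + 1,
                reward=lambda s, a: float(s + 1),
                terminal=lambda s: s == 4)
print(tau, post(tau))  # (4, 10.0, 30.0), variance 1.25
```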
Enriching Disentanglement: From Logical Definitions to Quantitative Metrics
- disentangled representation learning, enriched category theory, logic-metric correspondence, endofunctor algebra (sub)homomorphism
- Yivan Zhang, Masashi Sugiyama
- Neural Information Processing Systems 2024 (NeurIPS’24)
- Annual Workshop on Topology, Algebra, and Geometry in Machine Learning (TAG-ML) at International Conference on Machine Learning 2023 (ICML’23)
- [arXiv] [OpenReview] [NeurIPS’24] [TAG-ML’23] [poster]
"How constant" is a non-constant function?
A constant function maps all inputs to the same output, but “how constant” is a non-constant function? Intuitively, a function is “more constant” if its output changes little when its input changes. We can measure constancy by checking how “spread out” the outputs are. This gives us a way to turn a yes-or-no question (“is it constant?”) into a numeric score (“how constant is it?”). Can we do this for other properties of a function?
In this work, we show how to do this kind of thing more generally. We take logical definitions—written using things like “equals”, “implies”, and “for all”—and turn them into quantitative metrics that tell us “how true” the property is. We do this by replacing each part of the logical formula with a matching quantitative counterpart: we replace the true/false values \(\truth\) with non-negative real numbers \(\quant\), replace equality \({=}: A \times A \to \truth\) with a kind of distance \(d: A \times A \to \quant\), and replace things like “for all” \(\forall\) with ways of combining values, like taking the maximum or average. This gives us a way to turn logical formulas into loss functions.
For example, consider the property of a function \(f: A \to B\) being constant. Logically, this means either
- (p1) \(\exists b \in B.\; \forall a \in A.\; f(a) = b\): all outputs are equal to the same value, or equivalently,
- (p2) \(\forall a, a' \in A.\; f(a) = f(a')\): all outputs are equal to each other.
To turn them into quantitative metrics, we can replace equality with a distance and quantifiers with aggregations and measure
- (q1) \(\inf_{b \in B} \operatorname{agg}_{a \in A} d(f(a), b)\): how far the outputs are from a central point, or
- (q2) \(\operatorname{agg}_{a, a' \in A} d(f(a), f(a'))\): how far the outputs are from each other.
Here, \(\operatorname{agg}\) can be any aggregation function that outputs zero if and only if all the inputs are zero, like the maximum, sum, or sum of squares. Then, both metrics give us a real number that tells us how close \(f: A \to B\) is to being constant.
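As a quick illustration (my own sketch, not the paper's code), here are (q1) and (q2) in Python, with the absolute difference \(d(x, y) = |x - y|\) as the distance and the maximum as the aggregation:

```python
import itertools

# A minimal sketch (mine, not the paper's code) of the constancy metrics
# (q1) and (q2), with d(x, y) = |x - y| as the distance and the maximum
# as the aggregation function agg.

def d(x, y):
    return abs(x - y)

def q1(f, A, B, agg=max):
    # (q1): inf over centers b in B of agg over a in A of d(f(a), b)
    return min(agg(d(f(a), b) for a in A) for b in B)

def q2(f, A, agg=max):
    # (q2): agg over all pairs (a, a') of d(f(a), f(a'))
    return agg(d(f(a), f(a2)) for a, a2 in itertools.product(A, A))

A = B = range(5)
constant = lambda a: 2
almost = lambda a: 2 + (a == 0)  # deviates only at a = 0

print(q1(constant, A, B), q2(constant, A))  # 0 0: exactly constant
print(q1(almost, A, B), q2(almost, A))      # 1 1: nearly constant
```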
The theory behind the logic-metric correspondence is actually quite simple and intuitive. It starts from simple facts like “the sum of two non-negative numbers is zero if and only if both numbers are zero.” Concretely, we can view the set \(\quant\) of non-negative real numbers equipped with the addition \(+: \quant \times \quant \to \quant\) as an “algebra” and the set \(\truth\) of binary truth values equipped with the conjunction \(\lcon: \truth \times \truth \to \truth\) as another algebra. They both have the “type” or “signature” \((-) \times (-) \to (-)\). For two algebras \((A, \alpha: A \times A \to A)\) and \((B, \beta: B \times B \to B)\), we can define an algebra homomorphism \(f: A \to B\) as a function that preserves the algebraic structure, meaning
\[f(\alpha(a, a')) = \beta(f(a), f(a'))\]
for all \(a, a' \in A\). Here, since we are interested in whether the metric is zero or not, we can consider the zero predicate \(\zeta: \quant \to \truth: x \mapsto (x = 0)\)—an “\(\texttt{is-zero}\)” function that returns true if and only if its input is zero. Then, “the sum of two non-negative numbers is zero if and only if both numbers are zero” can be rewritten as
\[\texttt{is-zero}(a + a') \leqv \texttt{is-zero}(a) \lcon \texttt{is-zero}(a')\]
for all \(a, a' \in \quant\). This exactly means that the zero predicate \(\zeta: \quant \to \truth\) is an algebra homomorphism from the algebra \((\quant, +)\) of non-negative real numbers to the algebra \((\truth, \lcon)\) of binary truth values.
Similarly, for a set \(A\), a quantity \(q: A \to \quant\) and a predicate \(p: A \to \truth\) can be viewed as algebras \((\quant, q: A \to \quant)\) and \((\truth, p: A \to \truth)\) with the same signature \(A \to (-)\). Saying that the quantity is homomorphic to the predicate via a function \(f: \quant \to \truth\) means that
\[f(q(a)) = p(a)\]
for all \(a \in A\). For example, the distances \(d_A: A \times A \to \quant\) that we need are those that are zero if and only if the two inputs are equal, i.e.,
\[\texttt{is-zero}(d_A(a, a')) \leqv (a =_A a')\]
for all \(a, a' \in A\). We call them strict premetrics. Then, we can also say that strict premetrics are homomorphic to the equality predicate \(=_A: A \times A \to \truth\) via the zero predicate \(\zeta: \quant \to \truth\).
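Both statements are easy to check mechanically. Here is a small sanity check (my own), with a finite sample of numbers standing in for \(\quant\):

```python
import itertools

# A small numerical sanity check (mine) of the two homomorphism statements:
# the zero predicate turns + into AND, and turns a strict premetric into equality.

def is_zero(x):            # the zero predicate zeta: [0, inf] -> {T, F}
    return x == 0

def d(a, b):               # a strict premetric on numbers
    return abs(a - b)

xs = [0, 0.5, 1, 2]
pairs = list(itertools.product(xs, xs))

# is-zero(a + a')  <->  is-zero(a) AND is-zero(a')
assert all(is_zero(a + b) == (is_zero(a) and is_zero(b)) for a, b in pairs)

# is-zero(d(a, a'))  <->  a = a'
assert all(is_zero(d(a, b)) == (a == b) for a, b in pairs)

print("both squares commute on all sampled pairs")
```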
Why do we care about this? Why bother rewriting simple facts as algebra homomorphisms? Because this gives us a way to build up more complex metrics from simple ones, compositionally. We can prove that if each part of a logical formula is homomorphic to a part of a metric, then the whole formula will still match the whole metric. This lets us turn logical definitions into quantitative metrics in a principled, compositional way.
A Category-theoretical Meta-analysis of Definitions of Disentanglement
- disentangled representation learning, category theory, universal construction, monoidal product
- Yivan Zhang, Masashi Sugiyama
- International Conference on Machine Learning 2023 (ICML’23)
- International Workshop on Symbolic-Neural Learning 2023 (SNL’23)
- [arXiv] [OpenReview] [ICML’23] [slides] [poster]
Why are there so many definitions of disentanglement?
Why are there so many definitions of disentanglement? Some are based on functions, some on group theory, others on statistics, information theory, or even topology. At first, they seem totally different. But they all try to capture a similar idea: things should stay separate in a meaningful way, so different parts of a system should depend on different parts of the input. So, are these definitions really different? Or are they secretly connected?
In this work, we show that many of these definitions actually use different versions of the same structure—product. More precisely, we look at morphisms like \(f: A \otimes B \to C\) from a product \(A \otimes B\) of objects and ask: is it essentially a morphism \(f': A \to C\) that depends only on \(A\), not \(B\)? Depending on the setting, the product \(\otimes\) can be the Cartesian product \(\times\) of sets and functions, the direct product of groups and their actions, the joint distribution of random variables, or the product manifold in topology. In each case, a “disentangled encoder” is one that reconstructs the product structure.
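In the Cartesian case, “depends only on \(A\)” can be checked directly: \(f\) factors through the projection \(A \times B \to A\) exactly when \(f(a, b)\) does not vary with \(b\). A toy illustration of this check (my own, not from the paper):

```python
# A toy illustration (mine, not from the paper) of the Cartesian case:
# f: A x B -> C "depends only on A" iff f(a, b) does not vary with b,
# i.e. f factors as f' after the projection A x B -> A.

def depends_only_on_first(f, A, B):
    return all(len({f(a, b) for b in B}) == 1 for a in A)

A, B = range(3), range(4)
print(depends_only_on_first(lambda a, b: a * a, A, B))  # True: ignores b
print(depends_only_on_first(lambda a, b: a + b, A, B))  # False: entangled
```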
We use category theory to explain this in a general and unified way. This helps clarify how different definitions relate to each other and gives researchers a common language to talk about them. It also makes it easier to choose definitions or design methods that match the structure of a given problem.
[product, product morphism, monoidal product]
Learning Noise Transition Matrix from Only Noisy Labels via Total Variation Regularization
- weakly supervised learning, noisy label learning, transition matrix, contraction mapping
- Yivan Zhang, Gang Niu, Masashi Sugiyama
- International Conference on Machine Learning 2021 (ICML’21)
- [arXiv] [ICML’21] [code] [slides] [poster]
Learning from Aggregate Observations
- weakly supervised learning, multiple instance learning, consistency up to an equivalence
- Yivan Zhang, Nontawat Charoenphakdee, Zhenguo Wu, Masashi Sugiyama
- Neural Information Processing Systems 2020 (NeurIPS’20)
- [arXiv] [NeurIPS’20] [code]