INDEX
Explanation: linguistic terms and researchers
Negative Logits
Token         Value
primers       0.41
relational    0.40
integrand     0.39
лан           0.38
cond          0.37
input         0.36
primer        0.36
lan           0.36
epic          0.36
geometries    0.36
Positive Logits
Token          Value
lingu          1.00
Lingu          0.95
Linguistic     0.95
lingu          0.94
linguistic     0.91
Linguistics    0.91
lingü          0.88
linguistique   0.86
linguistics    0.85
ling           0.79
Activations Density: 0.012%