INDEX
Explanations
sentences focused on personal experiences and expressions of identity
Negative Logits
ruta      -0.15
MBER      -0.14
ertino    -0.14
Dll       -0.14
lk        -0.14
routes    -0.14
Ñĥва      -0.13
Bulk      -0.13
avax      -0.13
prox      -0.13
Positive Logits
aved          0.19
tanto         0.18
encounter     0.18
aves          0.17
encounters    0.16
æīĢ           0.16
Encounter     0.16
ivate         0.15
ÏĢο           0.15
bservice      0.15
Activations Density: 0.235%