INDEX
Explanations
phrases that denote representation or acting on someone else's behalf
Negative Logits
sap     -0.48
痴      -0.48
wira    -0.46
ishi    -0.45
τας     -0.45
劣      -0.44
Tre     -0.43
        -0.42
sap     -0.42
Gom     -0.41
Positive Logits
mewakili        1.01
representing    0.98
behalf          0.94
Representing    0.90
representing    0.89
IsMutable       0.83
Represent       0.78
wakili          0.77
houſe           0.77
perſon          0.77
Activations Density 0.200%