Explanations
References to a specific entity or organization, particularly names that start with "Kon" or a variation thereof.
Negative Logits
eon      -0.21
eel      -0.18
eum      -0.17
dana     -0.17
avana    -0.16
eled     -0.16
eous     -0.16
ois      -0.16
ece      -0.16
uteur    -0.16
Positive Logits
igs      0.21
stant    0.21
rad      0.20
ig       0.18
ishi     0.18
σταν     0.18
ardy     0.18
ings     0.17
jac      0.16
tim      0.15
Activation Density: 0.010%