Explanations
phrases and structures related to personal experiences and opinions
Negative Logits
ero     -0.17
aph     -0.17
-       -0.14
idle    -0.14
antic   -0.14
acc     -0.14
ore     -0.13
-c      -0.13
go      -0.13
á       -0.13
Positive Logits
also      0.20
ALSO      0.19
also      0.19
také      0.18
Also      0.18
также     0.17
también   0.16
також     0.15
também    0.15
또한      0.15
Activations Density 0.418%