INDEX
Explanations
occurrences of various forms of the verb "act"
Negative Logits
erable    -0.17
ths       -0.15
оиÑĤ      -0.15
áce       -0.15
edy       -0.15
asters    -0.15
ackers    -0.14
itzer     -0.14
Ùĩ        -0.14
shima     -0.14
Positive Logits
uate      0.24
uated     0.23
uating    0.21
uality    0.20
uator     0.17
uations   0.17
uation    0.17
ually     0.15
uar       0.15
ivia      0.15
Activations Density 0.039%