INDEX
Explanations
words related to actions or activities
occurrences of the word "ACT" and variations related to actions or activities
Negative Logits
Token      Logit
nel        -0.71
kowski     -0.70
chau       -0.69
strips     -0.65
ley        -0.65
Mous       -0.64
hij        -0.63
lim        -0.62
Sloven     -0.62
strip      -0.62
Positive Logits
Token      Logit
ACT        4.31
ACTION     2.32
acts       2.16
act        2.05
ACTED      1.86
ACT        1.60
acting     1.42
acted      1.39
AC         1.32
Act        1.29
Activations Density 0.007%
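
For orientation, the density figure can be read as the fraction of token positions on which the feature fires at all. The sketch below assumes that definition (activation strictly greater than zero). The function name `activation_density` and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of tokens with a nonzero activation.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 7 of 100,000 token positions.
acts = [0.0] * 99_993 + [1.5] * 7
density = activation_density(acts)
print(f"{density:.3%}")  # 0.007%
```

Under this reading, a density of 0.007% corresponds to roughly 7 active tokens per 100,000, which fits a feature that responds narrowly to "ACT" and its variants.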