Explanations
words related to physical actions or movements, particularly those involving physical contact or struggle
instances of words ending in 'les', 'led', or 'ling', indicating a focus on certain word forms
Negative Logits
ĻĤ          -0.86
ÑĢ          -0.73
ãĥ¼ãĥ³      -0.68
EMP         -0.67
ITNESS      -0.65
л           -0.65
urd         -0.64
Inf         -0.63
OGR         -0.63
raviolet    -0.61
Positive Logits
puff        1.12
ptin        0.98
cone        0.90
tail        0.85
jee         0.79
iness       0.78
kered       0.77
awkwardly   0.76
ishly       0.76
nervously   0.74
Activations Density: 0.100%