INDEX
Explanations
mentions or descriptions of small or seemingly insignificant things
occurrences of the word "little."
Negative Logits
Token       Logit
iership     -0.81
idents      -0.75
ï¸          -0.74
sem         -0.72
arching     -0.71
intent      -0.71
orial       -0.71
emale       -0.71
restling    -0.71
chwitz      -0.70
Positive Logits
Token       Logit
girl        1.05
boy         0.99
brother     0.94
girls       0.91
boys        0.91
guy         0.91
sister      0.89
helper      0.88
bit         0.87
snippets    0.87
Activations Density: 0.031%