INDEX
Explanations
names of people, specifically focusing on a particular surname
occurrences of the term "aren" in the text
Negative Logits
Labrador      -0.70
jerk          -0.69
extent        -0.66
srfAttach     -0.64
Fish          -0.61
degree        -0.59
measure       -0.58
Kinnikuman    -0.58
magnitude     -0.58
Accuracy      -0.57
Positive Logits
aren          1.08
wen           0.91
isma          0.89
furt          0.88
stant         0.87
awan          0.84
sten          0.82
zen           0.81
wyn           0.81
wana          0.81
Activations Density: 0.008%