INDEX
Explanations
instances of actions or characteristics tied to a group of people
the verb "are" and its forms used in various contexts
Negative Logits
pedia       -0.63
Which       -0.61
Rout        -0.60
ricks       -0.60
ffe         -0.60
ooters      -0.59
¨           -0.59
ization     -0.58
apology     -0.57
Solution    -0.56
Positive Logits
wolves      0.91
supposed    0.78
nt          0.77
tein        0.73
held        0.68
wolf        0.68
supposedly  0.68
married     0.68
subscribed  0.67
behold      0.67
Activations Density: 0.201%