INDEX
Explanations
personal pronouns followed by verbs indicating direction or movement
references to specific individuals and their locations or intentions
Negative Logits
ĸļ              -0.65
uthor           -0.65
Reviewer        -0.65
advertisement   -0.64
naires          -0.64
odor            -0.61
Leilan          -0.61
chan            -0.61
oret            -0.60
ware            -0.60
Positive Logits
originated      0.93
reside          0.90
belong          0.89
originate       0.88
resides         0.85
resided         0.84
geographically  0.82
origin          0.80
weakest         0.80
belongs         0.79
Activations Density: 0.130%