INDEX
Explanations
locations or directions
repeated references to the pronoun "you" and phrases that indicate personal perspectives or positions
Negative Logits
uthor            -0.65
advertisement    -0.64
yss              -0.63
avage            -0.62
Hacker           -0.59
Leilan           -0.59
chan             -0.58
naires           -0.58
ienne            -0.58
429              -0.56
Positive Logits
weakest          0.93
reside           0.93
resided          0.91
resides          0.91
originated       0.80
originate        0.79
belong           0.79
happiest         0.78
ezvous           0.76
LIVE             0.75
Activations Density 0.265%
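
An activation density figure like the one above is usually read as the share of sampled token positions on which the feature is active. The following is a minimal sketch of that calculation, assuming "active" means an activation strictly above a threshold of zero; the function name `activation_density` and the sample data are hypothetical and are not taken from this dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of values strictly above `threshold`.

    `activations` is a sequence of per-token activation values for one
    feature. The zero threshold is an assumption made for illustration.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 2,000 token positions, 5 of them active.
sample = [0.0] * 1995 + [1.2, 0.4, 3.1, 0.7, 2.2]
print(f"{activation_density(sample):.3%}")  # prints 0.250%
```

Under this reading, a density of 0.265% would mean the feature is active on roughly 1 in every 377 sampled tokens.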