INDEX
Explanations
occurrences of the word "Read" and variations related to reading or sharing information
Negative Logits
anner       -0.17
cí          -0.17
eldorf      -0.17
sten        -0.16
gia         -0.16
chner       -0.16
frica       -0.15
coni        -0.14
utherland   -0.14
andom       -0.14
Positive Logits
more        0.36
ily         0.31
ers         0.31
iness       0.30
More        0.30
ym          0.28
ings        0.28
just        0.27
ying        0.27
mission     0.27
Activations Density 0.024%