INDEX
Explanations
pronouns that refer to a specific concept being mentioned
instances and references of the word "it."
Negative Logits
シャ         -0.71
76561        -0.62
Topics       -0.61
Aluminum     -0.58
Mans         -0.57
pi           -0.56
course       -0.56
Carbon       -0.56
Hanson       -0.55
Dam          -0.54
Positive Logits
chy           0.93
enthusi       0.88
firsthand     0.85
recre         0.84
commercially  0.83
self          0.82
recip         0.79
selves        0.79
legitimately  0.76
anecd         0.76
Activations Density 0.175%