Explanations
numeric values that may represent significant data points or measures
Negative Logits
Token         Logit
swearing      -0.62
Sachs         -0.60
Lauder        -0.59
ment          -0.58
Goodman       -0.58
Cutter        -0.57
recognition   -0.57
Palest        -0.57
idence        -0.56
pitched       -0.56
Positive Logits
Token         Logit
96            1.26
76            1.24
90            1.22
86            1.21
87            1.20
73            1.19
54            1.19
89            1.18
63            1.18
56            1.18
Activations Density 0.230%