Explanations
mentions of interviews
repeated references to interviews
Negative Logits
axy    -0.82
cil    -0.80
¶ħ     -0.76
mil    -0.74
math   -0.74
hover  -0.70
ple    -0.68
ignt   -0.66
fal    -0.66
warm   -0.65
Positive Logits
interviews    1.06
interview     1.00
Interview     0.93
ees           0.92
Interview     0.87
ioned         0.86
interviewing  0.80
Transcript    0.78
transcripts   0.76
annel         0.75
Activations Density 0.020%