INDEX
Explanations
references to age in relation to young individuals, particularly teenagers
Negative Logits
Token       Logit
ichick      -1.00
SHIP        -0.80
agre        -0.78
ipedia      -0.77
vernment    -0.77
destro      -0.77
oldown      -0.76
office      -0.75
acly        -0.74
TIME        -0.73
Positive Logits
Token       Logit
boy         0.94
girl        0.91
freshman    0.86
student     0.86
classmate   0.85
runaway     0.83
Afghan      0.82
sophomore   0.81
Nigerian    0.81
female      0.79
Activations Density 0.031%