[paper review] A Critical Evaluation of Website Fingerprinting Attacks


finallyupper 2024. 2. 19. 12:25

Goal

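Check whether website fingerprinting (WF) attacks are actually feasible in practice
1. Validate the assumptions
2. Find the variables that materially affect accuracy
3. Work out how false positives (FP) can be reduced
4. Model the adversary's cost (maintaining a perfect WF system is expensive)

Two representative assumptions

The adversary and the user run the same TBB (Tor Browser Bundle)
Users visit the same localized version of a limited set of pages/sites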

Model

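Passive, local adversary
Assumption
- Can monitor the user's traffic, but cannot otherwise tamper with it
- Cannot decrypt the traffic

Targeted: number of victims = 1
- The model can be trained under conditions similar to the victim's
- In practice, though, learning the user's settings is hard
Non-targeted (dragnet surveillance): number of victims > 1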

Assumption

Basic model → 3 parts

Closed-world, browsing behavior, no background traffic and replicability.

Client-setting
- Closed world: the user visits only k webpages (k << the real number of webpages)
- Open world
- Browsing behaviour
  - In theory → the user visits pages sequentially
  - In practice → the user opens multiple tabs at once
Web
- Template websites = websites share the same basic template (HMM, Hidden Markov Model)
- e.g. localized versions of the webpages → the language is fixed (German, English, ...)
- But there is still plenty of room for dynamism
Adversary
- Page load parsing
  - The start and end of a page load can be detected
- No background traffic
  - Background network traffic can be filtered out
- Replicability
  - The model is trained under the same conditions as the victim.

⇒ Measure the influence of the "variables" that come out of making these assumptions!

Datasets

Alexa ranking
- The adversary can know which pages users visit
ALAD (Active Linguistic Authentication Dataset)
- Real-world users

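Methodology

Fix every variable except the one under observation (control crawl = default value, test crawl = observed value)
Why cross-validation? *It reduces the variance introduced by the randomness of Tor's path selection algorithm
= Tor's path selection algorithm and time are two control variables that cannot be fully fixed while measuring the influence of the other variables
Procedure
1. k-fold cross-validation on the control crawl (⇒ baseline for comparison)
2. Train on the control crawl, test on the test crawl (⇒ compare with the baseline above)
The SVM performed best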
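
The two-step procedure above can be sketched in Python. This is only a toy illustration under stated assumptions: the nearest-centroid classifier is a stand-in (not one of the classifiers evaluated in the paper), and the feature vectors, page labels, and function names are all made up.

```python
from statistics import mean

def fit(X, y):
    """Toy nearest-centroid model: one mean feature vector per page label."""
    grouped = {}
    for features, label in zip(X, y):
        grouped.setdefault(label, []).append(features)
    return {label: [mean(col) for col in zip(*rows)] for label, rows in grouped.items()}

def predict(model, features):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))

def accuracy(model, X, y):
    return sum(predict(model, f) == label for f, label in zip(X, y)) / len(y)

def kfold_accuracy(X, y, k):
    """Step 1: k-fold cross-validation inside the control crawl (baseline)."""
    scores = []
    for fold in range(k):
        train = [i for i in range(len(y)) if i % k != fold]
        held_out = [i for i in range(len(y)) if i % k == fold]
        model = fit([X[i] for i in train], [y[i] for i in train])
        scores.append(accuracy(model, [X[i] for i in held_out], [y[i] for i in held_out]))
    return mean(scores)

# Hypothetical traces: [total kilobytes, number of bursts] per page load.
control_X = [[10, 1], [20, 1], [11, 1], [21, 1], [9, 1], [19, 1], [10, 2], [20, 2]]
control_y = ["A", "B", "A", "B", "A", "B", "A", "B"]
# Test crawl: same pages, but the variable under observation shifts the traces.
test_X = [[16, 1], [17, 1], [26, 1], [27, 1]]
test_y = ["A", "A", "B", "B"]

control_acc = kfold_accuracy(control_X, control_y, k=4)
# Step 2: train on the whole control crawl, test on the test crawl.
test_acc = accuracy(fit(control_X, control_y), test_X, test_y)
print(control_acc, test_acc)
```

The gap between `control_acc` and `test_acc` is what gets attributed to the variable that differs between the two crawls.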

Time

Because page content changes? The packet distribution seems to change over time.

⇒ Accuracy drops sharply as time passes

Multitab browsing

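Compared with single-tab browsing, the attack essentially stops working (even though the delay was only 0.5 s)
Which learning model is used has little effect on accuracy
Taking more or less time does not change accuracy either (i.e. the length of time has no influence)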

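Tor Browser Bundle (TBB) version

Evaluate the impact of differing TBB versions and properties
- There are so many TBB versions that it is not easy for the adversary to know which one the user runs.
→ Evaluate the impact of using different TBB versions for training and testing
TBB versions → matching the training and testing versions is effective (versions 3.5 and 3.5.2.1)
- Version 2.4.7 paired with 3.5 gives very low accuracy (the two differ a lot)
- countermeasure based on request randomization integrated in the TBB may not be effective

TBB properties
- Two properties are examined
  1. UseEntryGuards
    - enabled = 3 entry guards are selected
    - disabled = entry guards are not pinned (Tor picks the entry node per circuit)
  2. NumEntryGuards
    - default is 3 entry guards
- Fixing a single entry guard gave the largest standard deviation.
⇒ Rather than fixing the entry guard, letting each batch pick a different entry guard gave a more "balanced distribution" (across batches) → smaller variance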

Networks

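The adversary cannot have exactly the same internet connection as the user

→ How does the model's accuracy change when different network locations are used? Observe!

Result ⇒ the internet connections are similar to the Internet backbone!
: Leuven is a university network and the other two belong to the same company → accuracy is very low when paired with the Leuven crawl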

The importance of false positives

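Open-world scenario
- Training
  - Uses both monitored and unmonitored pages
The base rate fallacy
- Earlier work used accuracy-based metrics that count the successes of a WF attack ⇒ this ignores the base rate (= prior).
- *base rate = probability that the user visits a monitored page
- == base rate fallacy (a bias in the evaluation)
- BDR (Bayesian detection rate) = the probability that a traffic trace really belongs to a monitored webpage, given that the model flagged it as monitored
  - e.g. BDR = 0.4% means a page flagged as monitored is correctly classified with probability 0.4% and wrongly flagged with probability 99.6%
  - Characteristics
    - Assumes a uniform distribution for P(M)
  - But assuming a uniform distribution over pages ignores the fact that popular pages are more likely to be visited. → The adversary is really interested in sites the user rarely visits, not sites like Google, so the prior will be small.
  - ⇒ BDR was computed for unpopular monitored pages (< 13.8%)
  - → So additional statistics were computed.
  - Conclusion
    - We suspect that BDR for even more unpopular pages would be so low that would render a WF attack ineffective in this scenario.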
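
To make the base rate effect concrete, here is a minimal Python sketch of BDR computed with Bayes' rule, BDR = TPR·P(M) / (TPR·P(M) + FPR·(1 − P(M))). The TPR, FPR, and base rate values are hypothetical and are not results from the paper.

```python
def bayesian_detection_rate(tpr, fpr, base_rate):
    """P(page is monitored | classifier says monitored), via Bayes' rule."""
    flagged_and_monitored = tpr * base_rate
    flagged_and_unmonitored = fpr * (1.0 - base_rate)
    return flagged_and_monitored / (flagged_and_monitored + flagged_and_unmonitored)

# Hypothetical classifier: 90% true positive rate, 10% false positive rate.
bdr_balanced = bayesian_detection_rate(tpr=0.9, fpr=0.1, base_rate=0.5)
bdr_rare = bayesian_detection_rate(tpr=0.9, fpr=0.1, base_rate=0.001)

print(f"base rate 50%:  BDR = {bdr_balanced:.3f}")
print(f"base rate 0.1%: BDR = {bdr_rare:.4f}")
```

With the same TPR and FPR, shrinking the prior from 50% to 0.1% drops BDR from 0.9 to under 1%, which is why accuracy alone says little about rarely visited pages.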

User's browsing habits

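An experiment with 3 selected users showed:

Low TPR
- Cause
- = users visit inner pages, which are not homepages
High FPR
- The model cannot output 'Unknown' for pages that are not in the training set
- → It outputs whichever training-set page is closest to the test page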

Classify-verify

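Suppose training uses A, B, C and every test sample is D
At test time the model has no choice but to answer with one of A, B, C, so false positives are inevitable
Solution
- For an SVM classifier, add a sigmoid function to obtain probabilities → apply a threshold to filter
- Verification scores
  - P1 = highest probability
  - Diff = P1 - P2 = highest probability minus second-highest probability
  - Uses the F_β score (instead of F1)
- Conclusion
  1. With Classify-Verify the number of FPs can be reduced without lowering the TPR.
  2. Using it doubles the BDR (still small, though).
  3. Not a complete fix, but it helps to some extent.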
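
The verification step can be sketched roughly as below. This is an illustration rather than the paper's implementation: it assumes the classifier already returns per-page probabilities (e.g. after the sigmoid step), and the function names, threshold, and probability values are hypothetical.

```python
def classify_verify(probabilities, threshold, score="diff"):
    """Accept the top prediction only if its verification score clears the threshold."""
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    (best_label, p1), (_, p2) = ranked[0], ranked[1]
    # Verification score: either P1 alone, or Diff = P1 - P2.
    verification_score = p1 if score == "p1" else p1 - p2
    return best_label if verification_score >= threshold else "Unknown"

def f_beta(precision, recall, beta):
    """F_beta score; beta < 1 weights precision more, beta > 1 weights recall more."""
    return (1 + beta ** 2) * precision * recall / (beta ** 2 * precision + recall)

confident = {"A": 0.80, "B": 0.15, "C": 0.05}  # trace that clearly looks like page A
ambiguous = {"A": 0.40, "B": 0.35, "C": 0.25}  # e.g. a trace of an unseen page D

print(classify_verify(confident, threshold=0.5))  # accepted as "A"
print(classify_verify(ambiguous, threshold=0.5))  # rejected as "Unknown"
```

In practice the threshold would be tuned on held-out data, for instance by picking the value that maximizes `f_beta` for a chosen beta.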

Modeling the adversary’s cost

Prior work considered only the scenario in which the adversary has the maximum amount of information about the user.
*The traffic footprint of the Google homepage varies considerably depending on the images embedded in the page!
WF system 4 tasks
1. Data collection cost: col(D), D = n*m*i
  - cost = network and storage cost
  - n = # of training pages
  - m = # of versions of webpages
  - i = # of instances per webpage
2. Training cost: train(D, F, C)
  - cost = cost of measuring features F and training classifier C
  - *c = the cost of training with a single instance of a traffic trace
3. Testing cost: col(T) + test(T, F, C), T = v*p
  - cost = cost of collecting test set T, extracting features F, and running the model on it
  - v = # of monitored victims
  - p = avg # of pages accessed by each victim per day
  - T = # of test data
4. Updating cost: update(D, F, C)
  - A threshold (e.g. 50%) is set to maintain performance
  - cost = cost of updating data D, measuring features F, and retraining the model
  - d = number of days over which the pages change ⇒ average cost per day $$ \frac{update(D,F,C)}{d} $$
Total cost
- Time affected model performance
  - As d grew, accuracy fell below 50%
  - Implication: this does not show that a successful WF attack is impossible, but it does show that keeping such an attack running is likely to be expensive

Conclusion

Results showed that the success of a WF adversary depends on many factors, such as

- temporal proximity of the training and testing traces,
- TBB versions used for training and testing,
- users’ browsing habits, which are commonly oversimplified in the WF models.

