z230

2026-06-02 17:20:20 +02:00
parent ec187e673a
commit b433ef0446
58 changed files with 9247 additions and 0 deletions
@@ -0,0 +1,43 @@
+"Protocol","Study Population","Country","Site","Principal Investigator","Participant ID","Baseline Stool Frequency","Visit","Visit Date","Endoscopy Completed?","Endoscopy Date","Bowel Preparation Start Date 1","Bowel Preparation End Date 1","Bowel Preparation Start Date 2","Bowel Preparation End Date 2","Central Endoscopy Score","Local Endoscopy Score","PGA Score","Eligible Day (-1)","Day (-1) Excluded Reason(s)","Eligible Day (-2)","Day (-2) Excluded Reason(s)","Eligible Day (-3)","Day (-3) Excluded Reason(s)","Eligible Day (-4)","Day (-4) Excluded Reason(s)","Eligible Day (-5)","Day (-5) Excluded Reason(s)","Eligible Day (-6)","Day (-6) Excluded Reason(s)","Eligible Day (-7)","Day (-7) Excluded Reason(s)","Eligible Day (-8)","Day (-8) Excluded Reason(s)","Eligible Day (-9)","Day (-9) Excluded Reason(s)","Eligible Day (-10)","Day (-10) Excluded Reason(s)","Eligible Day (-1) Stool Count","Eligible Day (-2) Stool Count","Eligible Day (-3) Stool Count","Eligible Day (-4) Stool Count","Eligible Day (-5) Stool Count","Eligible Day (-6) Stool Count","Eligible Day (-7) Stool Count","Eligible Day (-8) Stool Count","Eligible Day (-9) Stool Count","Eligible Day (-10) Stool Count","Stool Frequency Sub-score","Eligible Day (-1) Rectal Bleeding Score","Eligible Day (-2) Rectal Bleeding Score","Eligible Day (-3) Rectal Bleeding Score","Eligible Day (-4) Rectal Bleeding Score","Eligible Day (-5) Rectal Bleeding Score","Eligible Day (-6) Rectal Bleeding Score","Eligible Day (-7) Rectal Bleeding Score","Eligible Day (-8) Rectal Bleeding Score","Eligible Day (-9) Rectal Bleeding Score","Eligible Day (-10) Rectal Bleeding Score","Rectal Bleeding Sub-score","Partial Mayo Score","Modified Mayo Score","Full Mayo Score","Site Action","Last Mayo Score Submission","Week I-12 Clinical Responder","Week I-12 Clinical Remission","Clinical Flare","Loss of Response","Partial Mayo Response Post Loss of Response","Partial Mayo Response for Clinical Non-Responders"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10001","Matej Falc","CZ100012001","1","I-0","19 Feb 2026","Yes","05 Feb 2026","04 Feb 2026","04 Feb 2026","-","-","2","-","3","18 Feb 2026","-","17 Feb 2026","-","16 Feb 2026","-","15 Feb 2026","-","14 Feb 2026","-","13 Feb 2026","-","12 Feb 2026","-","11 Feb 2026","Day Not Applicable for Calculation","10 Feb 2026","Day Not Applicable for Calculation","09 Feb 2026","Day Not Applicable for Calculation","10","8","7","5","7","8","8","-","-","-","3","1","1","1","0","1","1","1","-","-","-","1","7","6","9","-","08 Apr 2026 07:11:25","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10001","Matej Falc","CZ100012001","1","I-2","04 Mar 2026","-","-","-","-","-","-","-","-","3","03 Mar 2026","-","02 Mar 2026","-","01 Mar 2026","-","28 Feb 2026","-","27 Feb 2026","-","26 Feb 2026","-","25 Feb 2026","-","24 Feb 2026","Day Not Applicable for Calculation","23 Feb 2026","Day Not Applicable for Calculation","22 Feb 2026","Day Not Applicable for Calculation","5","4","5","4","5","6","6","-","-","-","2","1","0","1","0","1","0","1","-","-","-","1","6","","","-","28 May 2026 10:04:05","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10001","Matej Falc","CZ100012001","1","I-4","18 Mar 2026","-","-","-","-","-","-","-","-","2","17 Mar 2026","-","16 Mar 2026","-","15 Mar 2026","-","14 Mar 2026","-","13 Mar 2026","-","12 Mar 2026","-","11 Mar 2026","-","10 Mar 2026","Day Not Applicable for Calculation","09 Mar 2026","Day Not Applicable for Calculation","08 Mar 2026","Day Not Applicable for Calculation","5","5","5","4","5","4","5","-","-","-","2","1","0","0","1","1","1","0","-","-","-","1","5","","","-","08 Apr 2026 07:11:43","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10001","Matej Falc","CZ100012001","1","I-8","05 May 2026","-","-","-","-","-","-","-","-","1","04 May 2026","-","03 May 2026","-","02 May 2026","-","01 May 2026","-","30 Apr 2026","-","29 Apr 2026","-","28 Apr 2026","-","27 Apr 2026","Day Not Applicable for Calculation","26 Apr 2026","Day Not Applicable for Calculation","25 Apr 2026","Day Not Applicable for Calculation","3","3","4","4","5","4","4","-","-","-","2","1","1","1","1","1","1","1","-","-","-","1","4","","","-","28 May 2026 14:42:53","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10001","Matej Falc","CZ100012001","1","I-12","13 May 2026","Yes","06 May 2026","05 May 2026","05 May 2026","-","-","1","-","1","12 May 2026","-","11 May 2026","-","10 May 2026","-","09 May 2026","-","08 May 2026","-","07 May 2026","-","06 May 2026","Endoscopy","05 May 2026","Bowel Preparation for Procedure;Day Not Applicable for Calculation","04 May 2026","-","03 May 2026","Day Not Applicable for Calculation","5","4","6","5","5","5","-","-","3","-","2","1","0","1","1","1","1","-","-","1","-","1","4","4","5","-","28 May 2026 14:43:11","Clinical Responder","No","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10001","Matej Falc","CZ100012002","1","I-0","08 Apr 2026","Yes","18 Mar 2026","17 Mar 2026","18 Mar 2026","-","-","2","-","2","07 Apr 2026","-","06 Apr 2026","-","05 Apr 2026","-","04 Apr 2026","Missing Diary","03 Apr 2026","-","02 Apr 2026","-","01 Apr 2026","-","31 Mar 2026","Day Not Applicable for Calculation","30 Mar 2026","Day Not Applicable for Calculation","29 Mar 2026","Day Not Applicable for Calculation","3","3","4","-","3","3","4","-","-","-","1","0","0","0","-","0","0","1","-","-","-","0","3","3","5","-","-","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10001","Matej Falc","CZ100012002","1","I-2","23 Apr 2026","-","-","-","-","-","-","-","-","2","22 Apr 2026","Missing Diary","21 Apr 2026","-","20 Apr 2026","-","19 Apr 2026","-","18 Apr 2026","-","17 Apr 2026","-","16 Apr 2026","-","15 Apr 2026","Day Not Applicable for Calculation","14 Apr 2026","Day Not Applicable for Calculation","13 Apr 2026","Day Not Applicable for Calculation","-","3","3","6","5","5","4","-","-","-","2","-","0","0","1","1","1","1","-","-","-","1","5","","","-","-","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10001","Matej Falc","CZ100012002","1","I-4","06 May 2026","-","-","-","-","-","-","-","-","1","05 May 2026","-","04 May 2026","-","03 May 2026","-","02 May 2026","-","01 May 2026","-","30 Apr 2026","-","29 Apr 2026","-","28 Apr 2026","Day Not Applicable for Calculation","27 Apr 2026","Day Not Applicable for Calculation","26 Apr 2026","Day Not Applicable for Calculation","6","3","2","3","3","3","3","-","-","-","1","1","0","0","0","1","1","0","-","-","-","0","2","","","-","28 May 2026 14:43:38","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10001","Matej Falc","CZ100012003","1","I-0","27 May 2026","Yes","13 May 2026","12 May 2026","12 May 2026","-","-","3","-","2","26 May 2026","-","25 May 2026","-","24 May 2026","-","23 May 2026","-","22 May 2026","-","21 May 2026","-","20 May 2026","-","19 May 2026","Day Not Applicable for Calculation","18 May 2026","Day Not Applicable for Calculation","17 May 2026","Day Not Applicable for Calculation","6","9","7","8","9","7","8","-","-","-","3","2","2","2","2","1","1","1","-","-","-","2","7","8","10","-","27 May 2026 07:24:39","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10006","Michal Konecny","CZ100062001","1","I-0","20 Mar 2026","Yes","19 Feb 2026","-","-","-","-","3","-","3","19 Mar 2026","-","18 Mar 2026","-","17 Mar 2026","-","16 Mar 2026","-","15 Mar 2026","-","14 Mar 2026","-","13 Mar 2026","-","12 Mar 2026","Day Not Applicable for Calculation","11 Mar 2026","Day Not Applicable for Calculation","10 Mar 2026","Day Not Applicable for Calculation","7","7","8","8","7","8","5","-","-","-","3","2","1","1","1","1","1","0","-","-","-","1","7","7","10","-","20 Mar 2026 07:03:23","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10006","Michal Konecny","CZ100062001","1","I-2","08 Apr 2026","-","-","-","-","-","-","-","-","2","07 Apr 2026","Medication For Diarrhea","06 Apr 2026","Medication For Diarrhea","05 Apr 2026","Medication For Diarrhea","04 Apr 2026","Medication For Diarrhea","03 Apr 2026","Medication For Diarrhea","02 Apr 2026","Medication For Diarrhea","01 Apr 2026","Medication For Diarrhea","31 Mar 2026","Medication For Diarrhea;Day Not Applicable for Calculation","30 Mar 2026","Medication For Diarrhea;Day Not Applicable for Calculation","29 Mar 2026","Day Not Applicable for Calculation","-","-","-","-","-","-","-","-","-","-","Non-Evaluable","-","-","-","-","-","-","-","-","-","-","Non-Evaluable","Non-Evaluable","Non-Evaluable","Non-Evaluable","-","-","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10006","Michal Konecny","CZ100062001","1","I-4","15 Apr 2026","-","-","-","-","-","-","-","-","3","14 Apr 2026","-","13 Apr 2026","-","12 Apr 2026","-","11 Apr 2026","-","10 Apr 2026","-","09 Apr 2026","-","08 Apr 2026","-","07 Apr 2026","Medication For Diarrhea;Day Not Applicable for Calculation","06 Apr 2026","Medication For Diarrhea;Day Not Applicable for Calculation","05 Apr 2026","Medication For Diarrhea;Day Not Applicable for Calculation","9","22","20","19","17","18","18","-","-","-","3","1","3","2","2","2","2","2","-","-","-","2","8","","","-","04 May 2026 22:06:03","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10006","Michal Konecny","CZ100062001","1","I-8","18 May 2026","-","-","-","-","-","-","-","-","2","17 May 2026","-","16 May 2026","-","15 May 2026","-","14 May 2026","-","13 May 2026","-","12 May 2026","-","11 May 2026","-","10 May 2026","Day Not Applicable for Calculation","09 May 2026","Day Not Applicable for Calculation","08 May 2026","Day Not Applicable for Calculation","7","5","9","7","7","8","8","-","-","-","3","1","1","1","1","1","1","1","-","-","-","1","6","","","-","29 May 2026 15:44:46","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10006","Michal Konecny","CZ100062002","1","I-0","26 May 2026","Yes","14 May 2026","13 May 2026","13 May 2026","-","-","2","-","2","25 May 2026","-","24 May 2026","-","23 May 2026","-","22 May 2026","-","21 May 2026","-","20 May 2026","-","19 May 2026","-","18 May 2026","Day Not Applicable for Calculation","17 May 2026","Day Not Applicable for Calculation","16 May 2026","Day Not Applicable for Calculation","8","8","6","7","7","6","7","-","-","-","3","2","2","2","2","2","2","2","-","-","-","2","7","7","9","-","29 May 2026 15:45:00","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10009","Jiri Pumprla","CZ100092001","1","I-0","05 May 2026","Yes","24 Apr 2026","23 Apr 2026","23 Apr 2026","-","-","2","-","2","04 May 2026","-","03 May 2026","-","02 May 2026","-","01 May 2026","-","30 Apr 2026","-","29 Apr 2026","-","28 Apr 2026","-","27 Apr 2026","Day Not Applicable for Calculation","26 Apr 2026","Day Not Applicable for Calculation","25 Apr 2026","Day Not Applicable for Calculation","5","5","5","5","5","5","5","-","-","-","2","1","1","1","1","1","1","1","-","-","-","1","5","5","7","-","05 May 2026 11:19:40","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10009","Jiri Pumprla","CZ100092001","1","I-2","19 May 2026","-","-","-","-","-","-","-","-","1","18 May 2026","-","17 May 2026","-","16 May 2026","-","15 May 2026","-","14 May 2026","-","13 May 2026","-","12 May 2026","-","11 May 2026","Day Not Applicable for Calculation","10 May 2026","Day Not Applicable for Calculation","09 May 2026","Day Not Applicable for Calculation","5","4","5","5","5","4","6","-","-","-","2","1","1","1","1","1","1","1","-","-","-","1","4","","","-","19 May 2026 10:38:25","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10012","Stefan Konecny","CZ100122001","5","I-0","07 Apr 2026","Yes","24 Mar 2026","22 Mar 2026","22 Mar 2026","-","-","2","-","2","06 Apr 2026","-","05 Apr 2026","-","04 Apr 2026","-","03 Apr 2026","-","02 Apr 2026","-","01 Apr 2026","-","31 Mar 2026","-","30 Mar 2026","Day Not Applicable for Calculation","29 Mar 2026","Day Not Applicable for Calculation","28 Mar 2026","Day Not Applicable for Calculation","8","11","5","9","11","10","13","-","-","-","3","1","2","2","2","2","2","2","-","-","-","2","7","7","9","-","04 May 2026 08:44:52","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10012","Stefan Konecny","CZ100122001","5","I-2","22 Apr 2026","-","-","-","-","-","-","-","-","2","21 Apr 2026","-","20 Apr 2026","-","19 Apr 2026","-","18 Apr 2026","-","17 Apr 2026","-","16 Apr 2026","-","15 Apr 2026","-","14 Apr 2026","Day Not Applicable for Calculation","13 Apr 2026","Day Not Applicable for Calculation","12 Apr 2026","Day Not Applicable for Calculation","7","5","6","6","7","8","2","-","-","-","1","1","0","1","1","1","2","0","-","-","-","1","4","","","-","04 May 2026 08:45:07","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10012","Stefan Konecny","CZ100122001","5","I-4","07 May 2026","-","-","-","-","-","-","-","-","1","06 May 2026","-","05 May 2026","-","04 May 2026","-","03 May 2026","-","02 May 2026","-","01 May 2026","-","30 Apr 2026","-","29 Apr 2026","Day Not Applicable for Calculation","28 Apr 2026","Day Not Applicable for Calculation","27 Apr 2026","Day Not Applicable for Calculation","8","7","7","8","4","11","7","-","-","-","1","2","1","1","1","0","1","1","-","-","-","1","3","","","-","01 Jun 2026 00:57:35","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10013","David Stepek","CZ100132001","1","I-0","24 Mar 2026","Yes","12 Mar 2026","11 Mar 2026","11 Mar 2026","-","-","2","-","2","23 Mar 2026","-","22 Mar 2026","-","21 Mar 2026","-","20 Mar 2026","-","19 Mar 2026","-","18 Mar 2026","-","17 Mar 2026","-","16 Mar 2026","Day Not Applicable for Calculation","15 Mar 2026","Day Not Applicable for Calculation","14 Mar 2026","Day Not Applicable for Calculation","8","6","5","7","6","7","6","-","-","-","3","1","1","1","0","1","1","1","-","-","-","1","6","6","8","-","05 Apr 2026 22:41:27","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10013","David Stepek","CZ100132001","1","I-2","08 Apr 2026","-","-","-","-","-","-","-","-","2","07 Apr 2026","-","06 Apr 2026","-","05 Apr 2026","-","04 Apr 2026","-","03 Apr 2026","-","02 Apr 2026","-","01 Apr 2026","-","31 Mar 2026","Day Not Applicable for Calculation","30 Mar 2026","Day Not Applicable for Calculation","29 Mar 2026","Day Not Applicable for Calculation","5","2","3","6","5","5","5","-","-","-","2","0","0","0","0","1","1","0","-","-","-","0","4","","","-","28 May 2026 23:19:03","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10013","David Stepek","CZ100132001","1","I-4","21 Apr 2026","-","-","-","-","-","-","-","-","0","20 Apr 2026","-","19 Apr 2026","-","18 Apr 2026","-","17 Apr 2026","-","16 Apr 2026","-","15 Apr 2026","-","14 Apr 2026","-","13 Apr 2026","Day Not Applicable for Calculation","12 Apr 2026","Day Not Applicable for Calculation","11 Apr 2026","Day Not Applicable for Calculation","4","3","4","3","3","4","4","-","-","-","2","0","0","0","0","0","0","0","-","-","-","0","2","","","-","27 May 2026 12:54:41","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10013","David Stepek","CZ100132002","1","I-0","12 May 2026","Yes","21 Apr 2026","20 Apr 2026","21 Apr 2026","-","-","2","-","2","11 May 2026","-","10 May 2026","-","09 May 2026","-","08 May 2026","-","07 May 2026","-","06 May 2026","-","05 May 2026","Missing Diary","04 May 2026","Day Not Applicable for Calculation","03 May 2026","Day Not Applicable for Calculation","02 May 2026","Day Not Applicable for Calculation","2","1","1","1","1","2","-","-","-","-","0","0","0","0","0","0","0","-","-","-","-","0","2","2","4","-","28 May 2026 23:19:30","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10013","David Stepek","CZ100132002","1","I-2","26 May 2026","-","-","-","-","-","-","-","-","1","25 May 2026","-","24 May 2026","Missing Diary","23 May 2026","-","22 May 2026","-","21 May 2026","-","20 May 2026","-","19 May 2026","-","18 May 2026","Missing Diary;Day Not Applicable for Calculation","17 May 2026","Day Not Applicable for Calculation","16 May 2026","Day Not Applicable for Calculation","1","-","1","2","1","2","2","-","-","-","1","0","-","0","0","0","0","0","-","-","-","0","2","","","-","28 May 2026 23:19:51","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10013","David Stepek","CZ100132003","0","I-0","02 Jun 2026","Yes","25 May 2026","24 May 2026","24 May 2026","-","-","2","-","2","01 Jun 2026","-","31 May 2026","-","30 May 2026","-","29 May 2026","-","28 May 2026","-","27 May 2026","-","26 May 2026","-","25 May 2026","Endoscopy;Missing Diary;Day Not Applicable for Calculation","24 May 2026","Bowel Preparation for Procedure;Missing Diary;Day Not Applicable for Calculation","23 May 2026","Missing Diary;Day Not Applicable for Calculation","8","8","11","10","10","11","6","-","-","-","3","2","2","1","2","1","2","2","-","-","-","2","7","7","9","-","02 Jun 2026 08:17:40","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10016","Robert Mudr","CZ100162001","1","I-0","28 May 2026","Yes","19 May 2026","18 May 2026","19 May 2026","-","-","3","-","3","27 May 2026","-","26 May 2026","-","25 May 2026","-","24 May 2026","-","23 May 2026","-","22 May 2026","-","21 May 2026","-","20 May 2026","Day Not Applicable for Calculation","19 May 2026","Endoscopy;Bowel Preparation for Procedure;Day Not Applicable for Calculation","18 May 2026","Bowel Preparation for Procedure;Day Not Applicable for Calculation","14","15","15","15","15","15","15","-","-","-","3","2","3","3","2","2","3","3","-","-","-","3","9","9","12","-","28 May 2026 10:21:31","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adolescent","Czech Republic","DD5-CZ10020","Lucie Gonsorcikova","CZ100201001","1","Unscheduled 1","04 May 2026","Yes","20 Apr 2026","12 Apr 2026","15 Apr 2026","-","-","2","-","3","03 May 2026","-","02 May 2026","-","01 May 2026","-","30 Apr 2026","-","29 Apr 2026","-","28 Apr 2026","-","27 Apr 2026","-","26 Apr 2026","Day Not Applicable for Calculation","25 Apr 2026","Day Not Applicable for Calculation","24 Apr 2026","Day Not Applicable for Calculation","5","6","6","7","6","3","3","-","-","-","2","0","0","0","0","0","0","0","-","-","-","0","5","4","7","-","-","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adolescent","Czech Republic","DD5-CZ10020","Lucie Gonsorcikova","CZ100201001","1","I-0","18 May 2026","Yes","01 May 2026","01 May 2026","01 May 2026","-","-","2","-","3","17 May 2026","-","16 May 2026","-","15 May 2026","-","14 May 2026","-","13 May 2026","-","12 May 2026","-","11 May 2026","-","10 May 2026","Day Not Applicable for Calculation","09 May 2026","Day Not Applicable for Calculation","08 May 2026","Day Not Applicable for Calculation","6","6","6","6","6","6","6","-","-","-","3","0","0","0","0","0","0","0","-","-","-","0","6","5","8","-","18 May 2026 08:36:37","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adolescent","Czech Republic","DD5-CZ10020","Lucie Gonsorcikova","CZ100201001","1","I-2","01 Jun 2026","-","-","-","-","-","-","-","-","3","31 May 2026","-","30 May 2026","Missing Diary","29 May 2026","Missing Diary","28 May 2026","Missing Diary","27 May 2026","-","26 May 2026","-","25 May 2026","-","24 May 2026","Day Not Applicable for Calculation","23 May 2026","Day Not Applicable for Calculation","22 May 2026","Day Not Applicable for Calculation","6","-","-","-","6","6","6","-","-","-","3","0","-","-","-","0","0","0","-","-","-","0","6","","","-","-","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10021","Martin Bortlik","CZ100212001","1","I-0","07 Apr 2026","Yes","16 Mar 2026","15 Mar 2026","16 Mar 2026","-","-","3","-","3","06 Apr 2026","-","05 Apr 2026","-","04 Apr 2026","-","03 Apr 2026","-","02 Apr 2026","-","01 Apr 2026","-","31 Mar 2026","-","30 Mar 2026","Day Not Applicable for Calculation","29 Mar 2026","Day Not Applicable for Calculation","28 Mar 2026","Day Not Applicable for Calculation","11","11","10","11","11","10","9","-","-","-","3","2","2","2","2","2","2","2","-","-","-","2","8","8","11","-","20 Apr 2026 09:27:58","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10021","Martin Bortlik","CZ100212001","1","I-2","20 Apr 2026","-","-","-","-","-","-","-","-","3","19 Apr 2026","-","18 Apr 2026","-","17 Apr 2026","-","16 Apr 2026","-","15 Apr 2026","-","14 Apr 2026","-","13 Apr 2026","-","12 Apr 2026","Day Not Applicable for Calculation","11 Apr 2026","Day Not Applicable for Calculation","10 Apr 2026","Day Not Applicable for Calculation","8","7","9","8","8","7","8","-","-","-","3","2","2","1","1","1","2","1","-","-","-","1","7","","","-","20 Apr 2026 09:29:01","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10021","Martin Bortlik","CZ100212001","1","I-4","05 May 2026","-","-","-","-","-","-","-","-","1","04 May 2026","-","03 May 2026","-","02 May 2026","-","01 May 2026","-","30 Apr 2026","-","29 Apr 2026","-","28 Apr 2026","-","27 Apr 2026","Day Not Applicable for Calculation","26 Apr 2026","Day Not Applicable for Calculation","25 Apr 2026","Day Not Applicable for Calculation","6","6","6","6","7","7","6","-","-","-","3","0","0","1","1","1","1","1","-","-","-","1","5","","","-","-","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10022","Petr Hrabak","CZ100222002","1","I-0","19 Feb 2026","Yes","11 Feb 2026","10 Feb 2026","11 Feb 2026","-","-","2","-","2","18 Feb 2026","-","17 Feb 2026","-","16 Feb 2026","-","15 Feb 2026","-","14 Feb 2026","-","13 Feb 2026","-","12 Feb 2026","-","11 Feb 2026","Endoscopy;Bowel Preparation for Procedure;Day Not Applicable for Calculation","10 Feb 2026","Bowel Preparation for Procedure;Day Not Applicable for Calculation","09 Feb 2026","Day Not Applicable for Calculation","3","2","2","3","4","3","2","-","-","-","1","1","1","0","0","0","2","2","-","-","-","1","4","4","6","-","19 Feb 2026 15:24:43","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10022","Petr Hrabak","CZ100222003","1","I-0","09 Mar 2026","Yes","11 Feb 2026","10 Feb 2026","11 Feb 2026","-","-","2","-","2","08 Mar 2026","-","07 Mar 2026","-","06 Mar 2026","-","05 Mar 2026","-","04 Mar 2026","-","03 Mar 2026","Missing Diary","02 Mar 2026","Missing Diary","01 Mar 2026","Missing Diary;Day Not Applicable for Calculation","28 Feb 2026","Missing Diary;Day Not Applicable for Calculation","27 Feb 2026","Missing Diary;Day Not Applicable for Calculation","7","7","6","6","7","-","-","-","-","-","3","2","2","2","2","2","-","-","-","-","-","2","7","7","9","-","27 Mar 2026 07:27:49","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10022","Petr Hrabak","CZ100222003","1","I-2","27 Mar 2026","-","-","-","-","-","-","-","-","2","26 Mar 2026","-","25 Mar 2026","-","24 Mar 2026","-","23 Mar 2026","-","22 Mar 2026","-","21 Mar 2026","-","20 Mar 2026","-","19 Mar 2026","Day Not Applicable for Calculation","18 Mar 2026","Day Not Applicable for Calculation","17 Mar 2026","Day Not Applicable for Calculation","7","3","3","3","5","5","5","-","-","-","2","0","0","1","1","1","1","2","-","-","-","1","5","","","-","08 Apr 2026 07:36:56","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10022","Petr Hrabak","CZ100222003","1","I-4","08 Apr 2026","-","-","-","-","-","-","-","-","2","07 Apr 2026","-","06 Apr 2026","-","05 Apr 2026","-","04 Apr 2026","-","03 Apr 2026","-","02 Apr 2026","-","01 Apr 2026","-","31 Mar 2026","Day Not Applicable for Calculation","30 Mar 2026","Day Not Applicable for Calculation","29 Mar 2026","Day Not Applicable for Calculation","3","3","4","4","5","4","3","-","-","-","2","1","0","0","2","1","1","2","-","-","-","1","5","","","-","08 Apr 2026 07:59:35","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10022","Petr Hrabak","CZ100222003","1","I-8","04 May 2026","-","-","-","-","-","-","-","-","2","03 May 2026","-","02 May 2026","-","01 May 2026","-","30 Apr 2026","-","29 Apr 2026","-","28 Apr 2026","-","27 Apr 2026","-","26 Apr 2026","Day Not Applicable for Calculation","25 Apr 2026","Day Not Applicable for Calculation","24 Apr 2026","Missing Diary;Day Not Applicable for Calculation","3","5","3","3","3","2","3","-","-","-","1","0","0","0","0","0","0","0","-","-","-","0","3","","","-","04 May 2026 08:08:40","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10022","Petr Hrabak","CZ100222003","1","I-12","01 Jun 2026","Yes","20 May 2026","19 May 2026","20 May 2026","-","-","3","-","2","31 May 2026","-","30 May 2026","-","29 May 2026","-","28 May 2026","-","27 May 2026","-","26 May 2026","-","25 May 2026","-","24 May 2026","Day Not Applicable for Calculation","23 May 2026","Day Not Applicable for Calculation","22 May 2026","Day Not Applicable for Calculation","4","4","6","3","3","3","3","-","-","-","2","1","1","2","1","1","1","2","-","-","-","1","5","6","8","-","01 Jun 2026 14:25:57","Clinical Nonresponder","No","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10022","Petr Hrabak","CZ100222005","1","I-0","09 Apr 2026","Yes","08 Apr 2026","31 Mar 2026","01 Apr 2026","-","-","2","-","2","08 Apr 2026","Endoscopy","07 Apr 2026","-","06 Apr 2026","-","05 Apr 2026","-","04 Apr 2026","-","03 Apr 2026","-","02 Apr 2026","-","01 Apr 2026","Bowel Preparation for Procedure;Day Not Applicable for Calculation","31 Mar 2026","Bowel Preparation for Procedure;Day Not Applicable for Calculation","30 Mar 2026","-","-","3","3","4","3","4","3","-","-","3","1","-","2","2","2","2","2","2","-","-","2","2","5","5","7","-","29 May 2026 11:07:08","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10022","Petr Hrabak","CZ100222005","1","I-2","22 Apr 2026","-","-","-","-","-","-","-","-","2","21 Apr 2026","-","20 Apr 2026","-","19 Apr 2026","-","18 Apr 2026","-","17 Apr 2026","-","16 Apr 2026","-","15 Apr 2026","-","14 Apr 2026","Day Not Applicable for Calculation","13 Apr 2026","Day Not Applicable for Calculation","12 Apr 2026","Day Not Applicable for Calculation","3","3","5","3","2","3","2","-","-","-","1","1","2","2","1","1","1","2","-","-","-","1","4","","","-","05 May 2026 15:00:39","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10022","Petr Hrabak","CZ100222005","1","I-4","05 May 2026","-","-","-","-","-","-","-","-","2","04 May 2026","-","03 May 2026","-","02 May 2026","-","01 May 2026","-","30 Apr 2026","-","29 Apr 2026","-","28 Apr 2026","-","27 Apr 2026","Day Not Applicable for Calculation","26 Apr 2026","Day Not Applicable for Calculation","25 Apr 2026","Day Not Applicable for Calculation","4","2","2","2","2","2","2","-","-","-","1","1","1","1","1","2","1","1","-","-","-","1","4","","","-","05 May 2026 07:30:02","N/A","N/A","N/A","N/A","N/A","N/A"
+"77242113UCO3001","Adult","Czech Republic","DD5-CZ10022","Petr Hrabak","CZ100222005","1","I-8","02 Jun 2026","-","-","-","-","-","-","-","-","2","01 Jun 2026","-","31 May 2026","-","30 May 2026","-","29 May 2026","-","28 May 2026","-","27 May 2026","-","26 May 2026","-","25 May 2026","Day Not Applicable for Calculation","24 May 2026","Day Not Applicable for Calculation","23 May 2026","Day Not Applicable for Calculation","2","2","2","2","2","4","10","-","-","-","1","2","1","2","1","2","2","2","-","-","-","2","5","","","-","02 Jun 2026 08:19:16","N/A","N/A","N/A","N/A","N/A","N/A"
@@ -0,0 +1,6 @@
+"Protocol","Country","Site ID","PI_NAME","Subject Number","Age","Data Correction ID","Creation Date UTC","Status","Date of Last Action UTC","Total Open Period","Total Open Time (Days)","Current Status Time (Days)","Type","Next Action Required","Category","Query History","Reason for Change"
+"77242113UCO3001_ANALYSIS","Czech Republic The","CZ10001","Falc, Matej","CZ100012001","48 Years","16923867","14-May-2026","Escalated","26-May-2026","8-14 Days","12","4","QUERY","Site","Patient","(3) 15 May 2026 Clario:  You can upload scans of your paper ECGs using the Site Upload Tool. ---- Instructions can be found in the ""Reference Materials"" tab of the study portal. Please contact Customer Care for assistance if needed!","Data Checks"
+"77242113UCO3001_ANALYSIS","Czech Republic The","CZ10001","Falc, Matej","CZ100012001","48 Years","16567067","22-Jan-2026","Resolved","28-Jan-2026","4-7 Days","4","","QUERY","","Patient","MD Falc","Data Checks"
+"77242113UCO3001_ANALYSIS","Czech Republic The","CZ10009","Pumprla, Jiri","CZ100092001","49 Years","16776685","31-Mar-2026","Resolved","13-May-2026","Over 28 Days","29","","QUERY","","Patient","(2) 13 May 2026 Clario:  I confirm, that only ONE ECG was collected by mistake.","Data Checks"
+"77242113UCO3001_ANALYSIS","Czech Republic The","CZ10021","Bortlik, Martin","CZ100212001","61 Years","16717619","11-Mar-2026","Resolved","28-Apr-2026","Over 28 Days","32","","QUERY","","Patient","(2) 28 Apr 2026 Clario:  I confirmed that due to technical problems, the ECG was done only twice","Data Checks"
+"77242113UCO3001_ANALYSIS","Czech Republic The","CZ10022","Hrabak, Petr","CZ100222003","39 Years","16945114","21-May-2026","Escalated","27-May-2026","4-7 Days","7","3","DCR","Site","Patient","(6) 27 May 2026 Botdorf, Paul-Daniel:  We still do not have any ECGs for any patients at your site with a collection Date/Time of  20-May-2026 at  14:19:34, 14:20:32, 14:21:15. Please review the records in the portal and let us know if anything more is needed. If you see these ECGs, please double check that this is actually the study they are currently in(77242113UCO3001_ANALYSIS).Thank you",""
@@ -0,0 +1,173 @@
+"Protocol","Country","Site","PI Name","Subject ID","Age at Informed Consent","Baseline Stool Count","Confirm Baseline Stool Count","Data Correction ID","Creation Date UTC","Status","Description","Date of Last Action UTC","Total Open Period","Total Open Time (Days)","Current Status Time (Days)","Type","Next Action Required","Category","Query History","Reason for Change","Resolution"
+"77242113UCO3001","Czech Republic","DD5-CZ10001","Matej Falc","CZ100012001","48","1","","SW00703544","13-May-2026","Submitted","Please change answer to clinical remission from no to YES (week 12).  Entry errors ","20-May-2026","8-14 Days","13","8","Query Active  ","Site","New","(1) 20 May 2026 msullivan (Clario): Please confirm your request
+
+Dear Site. Thank you for submitting this Data Clarification Request. 
+
+For us to process your request, please let us know the name of the form (with date) with question. 
+
+Thank you.  ERT/CLARIO   Data Coordination Team
+
+","Entry Error",""
+"77242113UCO3001","Czech Republic","DD5-CZ10001","Matej Falc","CZ100012002","79","1","","SW00696586","09-Apr-2026","ReadyForQC","Please correct date of endoscopy to date: 18 March 2026  (from 25 March 2026)","15-Apr-2026","Over 28 Days","35","31","Query Active  ","Site","Site-Entered Data","","Entry Error","CLARIO RESOLUTION:
+
+Part 1: In Mayo Subscore (1) dated 08 Apr 2026 for I-0 visit, CLARIO to make the following changes:
+- What was the date of endoscopy? (ENDODT1D): from 25 Mar 2026 to 18 Mar 2026
+- Data Flag (QSDFLG1B): from blank to check
+"
+"77242113UCO3001","Czech Republic","DD5-CZ10006","Michal Konecny","CZ100062001","19","1","","SW00704536","19-May-2026","ReadyForQC","Please change the endoscopy date to 19-FEB-2026. 06-MAR-2026 was entered in error. ","26-May-2026","8-14 Days","9","4","Query Active  ","Site","Site-Entered Data","","Entry Error","CLARIO RESOLUTION:
+
+Part 1: In Mayo Subscore (1) dated 20 Mar 2026 for I-0 visit, CLARIO to make the following changes:
+-What was the date of endoscopy? (ENDODT1D): from 06 Mar 2026 to 19 Feb 2026
+- Data Flag (QSDFLG1B): from blank to check
+"
+"77242113UCO3001","Czech Republic","DD5-CZ10012","Stefan Konecny","CZ100122001","22","5","Yes, I confirm this is the correct stool count.","SW00706684","01-Jun-2026","Submitted","The right endoscopy date is 23MAR2026, please change the date","01-Jun-2026","1 Day","1","1","","Clario DM","New","","Entry Error",""
+"77242113UCO3001","Czech Republic","DD5-CZ10013","David Stepek","CZ100132002","29","1","","SW00705646","26-May-2026","Submitted","Correct visit date  I-O is 12-May-2026.  All questionaries were filled on paper and entered in tablet later.
+Log-in issue. ","01-Jun-2026","4-7 Days","5","1","","Clario DM","New","(1) 01 Jun 2026 msullivan (Clario): Please confirm your request
+
+Dear Site. Thank you for submitting this Data Clarification. 
+
+     Please provide the timestamps for each of the assessments if you used paper forms and transcribed into the device. 
+     If unknown, ERT will use a dummy timestamp. 
+
+Thank you. ERT/CLARIO Data Coordination Team.  
+
+(2) 01 Jun 2026 dstepek@vnbrno.cz (Site User): time is unknown
+
+","Changed Information",""
+"77242113UCO3001","Czech Republic","DD5-CZ10013","David Stepek","CZ100132003","49","0","","SW00706581","29-May-2026","Submitted","baseline stool count reported by subject is 0, please change to 1 as per CRA request  (subject has 1 stool in 2-3 days if in remission)","29-May-2026","1 Day","1","1","","Clario DM","New","","Changed Information",""
+"77242113UCO3001","Czech Republic","DD5-CZ10016","Robert Mudr","CZ100162001","48","1","","SW00705916","27-May-2026","Submitted","As per ATS investigation (ATS26040111), please remove the below form which was entered as a duplicate    
+
+-  MAYO Diary (5) 24 Apr 2026","27-May-2026","4-7 Days","4","4","","Clario DM","New","","Technical Revision - Other",""
+"77242113UCO3001","Czech Republic","DD5-CZ10020","Lucie Gonsorcikova","CZ100201001","15","1","","SW00701729","06-May-2026","Completed","Dears, please delete data from visit I-0 (reported as 4th of May 2026) as this visit had to be postponed - see the previous DCR of this patient and change data request that was corrected. Patient has left the site before it was resolved and and new date of I-0 was planned. Patient continues to fill in his diary and patient is coming to I=0 visit within allowed window. We need the system and tablet to be ready to run new Mayo Score Report with updated and recent data (e.g. reflect new I-0 visit date, new eligible days -1 to -7.). 
+thank you, Jiri Skopek","19-May-2026","8-14 Days","8","","","","Visit Data","(1) 11 May 2026 msullivan (Clario): Please confirm your request
+
+Dear Site. Thank you for submitting this Data Clarification. 
+
+Please note that the delete forms are allowed if the reason is one of the following.
+If not, forms will move to unscheduled visit.
+
+Data collected by the wrong patient.
+Data collected by someone other than the patient.
+Data collected prior to informed consent, or after withdrawal from the study.
+Duplicate data erroneously entered at an Unscheduled visit via paper transcription.
+Data collected that is not expected per protocol.
+
+Also, I-0 visit is still ongoing. Please close the visit.
+Once the visit was closed, we will process accoridngly.
+
+Thank you.  ERT/CLARIO   Data Coordination Team
+
+(2) 11 May 2026 jskopek (Site User): Dears, 
+I do not see any option that is adequate -from the list. Data are not needed to be deleted fully, they reflect the situation at May4th. Please mark it as unscheduled visit - as exactly that is the case. We need the system to be ready for I-0 visit planned for next week. 
+I will close the visit tomorrow - do you mean in tablet/ipad? 
+Thank you very much for your help! Jiri  
+
+(3) 12 May 2026 venkata.ramana (Clario): Thank you for your response. 
+Please note that the visit I-0 was still ongoing but not closed yet.
+So please close the visit.
+Kind Regards, Clario Data Coordination Team.
+
+(4) 12 May 2026 jskopek (Site User): If I try to close the I-O visit in TABLET, it asks me if patient fulfils eligibility criteria to proceed to next visit based on these old data – if I answer NO, it asks me to DEACTIVATE patient. I do not want to DEACTIVATE patient – can you help WHERE and HOW to close this visit for you to change it to UNSCHEDULED and not to de-activate patient?
+Thank you Jiri
+
+
+","Other-delete visit I-0","CLARIO RESOLUTION:
+
+Part 1: In the following forms dated 04 May 2026, CLARIO to make the following changes:
+-Event ID: from I-0 to Unscheduled Visit 1
+-Event At Entry: from I-0 to Unscheduled Visit 1
+
+Visit Start (49)
+ePRO Availability (1)
+Mayo Subscore (1)
+PGA (1)
+
+Part 2: CLARIO to delete the following forms dated 04 May 2026 for I-0 visit.
+
+C-SSRS Since Last Visit (1)
+C-SSRS Since Last Visit Findings Report (1)
+
+Part 3: CLARIO to manually enter Visit End form for Unscheduled visit 1 with the following information:
+-Protocol: 77242113UCO3001
+-Report Date: 04 May 2026
+-Report Start Date and Time: 04 May 2026 23:59:59
+-Event ID: Unscheduled Visit 1
+-Event End Date: 04 May 2026 23:59:59
+-Visit Status: Incomplete
+-Phase At Entry: Screening
+-Phase At Entry Timestamp: 13 Apr 2026 12:32:20
+-Event At Entry: Unscheduled visit 1
+-Event Start Date: 04 May 2026 23:59:59
+-Event Time Zone Offset in Milliseconds: 7200000
+-Session Repeat Number (SESREP1N): 0
+-Session Instance Id (SESINST1S): 3f1214f0-4788-11f1-a0cf-bb403212adce
+"
+"77242113UCO3001","Czech Republic","DD5-CZ10020","Lucie Gonsorcikova","CZ100201001","15","1","","SW00701226","04-May-2026","Completed","Dears, we would like ask you to change the information I read on assignment form given by patient on April 13, 2026 (Visit 1), Baseline Stool Count (PT.Custom4)  as 3 that should be reported as 1. 
+Patient has entered wrong number as he did not understood it should be number of stools when illness is in remission or absent. He is a child and did not reflected this question correctly. Therefore, please change Baseline Stool Count = 1.
+Thank you, Jiri Skopek  ","04-May-2026","1 Day","1","","","","Demographic","","Changed Information","(Clario instructions)
+
+1. Please make below changes in the assignment form:
+
+Baseline Stool Count (PT. Custom4): 03 to 01."
+"77242113UCO3001","Czech Republic","DD5-CZ10021","Martin Bortlik","CZ100212001","61","1","","SW00699492","23-Apr-2026","ReadyForQC","Please correct the date of endoscopy done during screening visit of patient CZ100212001 to correct date 16-MAR-2026.","29-Apr-2026","22-28 Days","26","22","Query Active  ","Site","Site-Entered Data","","Changed Information","CLARIO RESOLUTION:
+
+Part 1: In the Mayo Subscore (1) dated 07 Apr 2026 for I-0 visit, CLARIO to make the following changes:
+-What was the date of endoscopy? (ENDODT1D): from 24 Mar 2026 to 16 Mar 2026
+- Data Flag (QSDFLG1B): from blank to check
+"
+"77242113UCO3001","Czech Republic","DD5-CZ10022","Petr Hrabak","CZ100222003","39","1","","SW00703322","12-May-2026","Completed","As per ATS investigation (ATS26040111), please remove the below form that's been entered as a duplicate    
+
+- MAYO Diary (16) - 18 Mar 2026
+","20-May-2026","4-7 Days","6","","","","Technical Revision","","Technical Revision - Other","CLARIO RESOLUTION:
+
+Part 1: CLARIO to delete the MAYO Diary (16) dated 18 Mar 2026.
+"
+"77242113UCO3001","Czech Republic","DD5-CZ10022","Petr Hrabak","CZ100222003","39","1","","SW00689748","09-Mar-2026","Completed","Dear all,
+
+Patient CZ 100222003 was randomized on 9 Mar 2026. Kindly correct the colonoscopy date to 11 Feb 2025.
+
+The date was initially entered as 21 Feb 2025 because the earlier date could not be entered in the system. The patient was rescreened.","02-Apr-2026","15-21 Days","17","","","","Site-Entered Data","(1) 13 Mar 2026 msullivan (Clario): Please confirm your request
+
+Dear Site. Thank you for submitting this Data Clarification. 
+
+Could you please conform that if you are requesting following?
+
+Mayo Subscore (1) dated 09 Mar 2026 for I-0 visit
+-What was the date of endoscopy? (ENDODT1D): from 23 Feb 2026 to 11 Feb 2025
+
+Could you please confirm the year? This subject was assigned on 02 Mar 2026, you are providing that correct date is 11 Feb 2025 which a year ago.
+If you are not requesting above, please provide us the name of the form with question. 
+
+Thank you.  ERT/CLARIO   Data Coordination Team
+
+
+(2) 13 Mar 2026 katerina.havlikova@clinoxus.com (Site User): confirm date of colonoscopy 11Feb2026
+
+(3) 21 Mar 2026 msullivan (Clario): Dear Site,
+
+The requested changes to the Mayo data have been updated. Please navigate to the Mayo Score Report and resubmit the form for visit to log the updated Mayo Score form. Once done, please respond to this query confirming that the Mayo Score has been resubmitted.
+
+Thank you.  ERT/CLARIO   Data Coordination Team
+
+(4) 24 Mar 2026 jana.pomahacova@clinoxus.com (Site User): Thank you and sent
+
+","New Information","CLARIO RESOLUTION:
+
+Part 1: In the Mayo Subscore (1) dated 09 Mar 2026 for I-0 visit, CLARIO to make the following changes:
+-What was the date of endoscopy? (ENDODT1D): from 23 Feb 2026 to 11 Feb 2025
+-Data Flag (QSDFLG1B): from blank to check"
+"77242113UCO3001","Czech Republic","DD5-CZ10022","Petr Hrabak","CZ100222005","33","1","","SW00705372","22-May-2026","Submitted","Dear all, please change Colonoscopz date from 8April2026 to date 01Apr2026 Thank you in advance","29-May-2026","4-7 Days","6","1","Query Active  ","Site","New","(1) 29 May 2026 msullivan (Clario): Please confirm your request
+
+Dear Site. Thank you for submitting this Data Clarification. 
+
+Please provide us the name of the form for this request.
+
+Thank you.  ERT/CLARIO   Data Coordination Team
+
+","Changed Information",""
+"77242113UCO3001","Czech Republic","DD5-CZ10022","Petr Hrabak","CZ100222005","33","1","","SW00702538","08-May-2026","Completed","This TRR is to document the correction to the Mayo Subscore (1) form, where the following variables were populated with NULL values, due to a known core defect:
+Event At Entry, Event Start Date, Event Time Zone Offset in Milliseconds.","12-May-2026","2-3 Days","2","","","","Technical Revision","","Technical Revision - Other","Please make the below changes in Mayo Subscore (1) dated 22 Apr 2026:
+
+-Event At Entry: I-0
+-Event Start Date: 09 Apr 2026 08:09:19
+-Event Time Zone Offset in Milliseconds: 7200000"
@@ -0,0 +1,648 @@
+"""
+create_report.py
+Version: 1.6
+Date: 2026-06-02
+
+Generates an Excel report (.xlsm) for study 77242113UCO3001 from the Clario MongoDB database.
+Output: U:/Dropbox/!!!Days/Downloads Z230/YYYY-MM-DD 77242113UCO3001 Clario Reports.xlsm
+
+Data source:
+  MongoDB 192.168.1.76, database Clario
+  Collection Clario.MayoScore:  Mayo scores, one per patient × visit
+  Collection Clario.MayoDiary:  daily patient diary entries
+  Collection Clario.eCOA_DCRs:  eCOA data correction requests
+  Collection Clario.ECG_DCRs:   ECG data correction requests
+
+Sheets:
+  MayoScore:    one row = patient × visit
+                the "KLIKNI SEM" column navigates to the filtered EligibleDays sheet
+                I-0 rows with Modified Mayo < 5 are shown in bold red
+  MayoDiary:    one row = one daily patient diary entry
+  EligibleDays: one row = one eligible day from MayoScore, enriched with MayoDiary data;
+                included/excluded flag, excluded days in gray on a yellow background
+  eCOA_DCRs:    all fields from the Clario.eCOA_DCRs collection
+  ECG_DCRs:     all fields from the Clario.ECG_DCRs collection
+
+VBA macro (Worksheet_SelectionChange on the MayoScore sheet):
+  Clicking the "KLIKNI SEM" column switches to EligibleDays and filters the records
+  for the given patient and visit. Macros must be enabled when the file is opened.
+"""
+
+VERSION = "1.6"
+
+from datetime import datetime
+from pathlib import Path
+import time
+
+from pymongo import MongoClient
+from openpyxl import Workbook
+from openpyxl.styles import Font, PatternFill, Alignment, Border, Side
+from openpyxl.utils import get_column_letter
+import xlwings as xw
+
+# ---------------------------------------------------------------------------
+# Configuration
+# ---------------------------------------------------------------------------
+
+MONGO_URI = "mongodb://192.168.1.76:27017"
+DB_NAME = "Clario"
+OUTPUT_DIR = Path(r"U:\Dropbox\!!!Days\Downloads Z230")
+
+VISIT_ORDER = ["I-0", "I-2", "I-4", "I-8", "I-12"]
+
+COLUMNS_SCORE = [
+    ("KLIKNI SEM",                  lambda d: "▶  klikni sem"),
+    ("Site",                        lambda d: d.get("site", {}).get("name", "")),
+    ("Subject ID",                  lambda d: d.get("subject", {}).get("id", "")),
+    ("Visit",                       lambda d: d["fields"].get("Visit", "")),
+    ("Visit Date",                  lambda d: d["fields"].get("Visit Date", "")),
+    ("Baseline Stool Frequency",    lambda d: _num(d["fields"].get("Baseline Stool Frequency", ""))),
+    ("Central Endoscopy Score",     lambda d: _num(d["fields"].get("Central Endoscopy Score", ""))),
+    ("PGA Score",                   lambda d: _num(d["fields"].get("PGA Score", ""))),
+    ("Stool Frequency Sub-score",   lambda d: _num(d["fields"].get("Stool Frequency Sub-score", ""))),
+    ("Rectal Bleeding Sub-score",   lambda d: _num(d["fields"].get("Rectal Bleeding Sub-score", ""))),
+    ("Partial Mayo Score",          lambda d: _num(d["fields"].get("Partial Mayo Score", ""))),
+    ("Modified Mayo Score",         lambda d: _num(d["fields"].get("Modified Mayo Score", ""))),
+    ("Full Mayo Score",             lambda d: _num(d["fields"].get("Full Mayo Score", ""))),
+    ("Site Action",                 lambda d: d.get("Site Action") or ""),
+    ("Last Mayo Score Submission",  lambda d: d.get("Last Mayo Score Submission") or ""),
+    ("Wk I-12 Responder",          lambda d: d.get("Week I-12 Clinical Responder") or ""),
+    ("Wk I-12 Remission",          lambda d: d.get("Week I-12 Clinical Remission") or ""),
+    ("Clinical Flare",             lambda d: d.get("Clinical Flare") or ""),
+    ("Loss of Response",           lambda d: d.get("Loss of Response") or ""),
+    ("Partial Mayo Post LoR",      lambda d: d.get("Partial Mayo Response Post Loss of Response") or ""),
+    ("Partial Mayo Non-Resp",      lambda d: d.get("Partial Mayo Response for Clinical Non-Responders") or ""),
+]
+
+COLUMNS_DIARY = [
+    ("Subject ID",              lambda d: d.get("subject", {}).get("id", "")),
+    ("Report Date",             lambda d: d["fields"].get("Report Date", "")),
+    ("Baseline Stool Count",    lambda d: _num(d["fields"].get("Baseline Stool Count", ""))),
+    ("Stool Frequency",         lambda d: _num(d["fields"].get("Stool Frequency", ""))),
+    ("MAYO050",                 lambda d: d["fields"].get("MAYO050", "")),
+    ("Not Applicable",          lambda d: d["fields"].get("Not Applicable", "")),
+    ("Constipation",            lambda d: d["fields"].get("Constipation", "")),
+    ("Diarrhea",                lambda d: d["fields"].get("Diarrhea", "")),
+    ("Irregularity",            lambda d: d["fields"].get("Irregularity", "")),
+]
+
+COLUMNS_ECOA_DCRS = [
+    ("Site",                        lambda d: d.get("site", {}).get("name", "")),
+    ("Subject ID",                  lambda d: d.get("subject", {}).get("id", "")),
+    ("Data Correction ID",          lambda d: d["fields"].get("Data Correction ID", "")),
+    ("PI Name",                     lambda d: d["fields"].get("PI Name", "")),
+    ("Creation Date UTC",           lambda d: d["fields"].get("Creation Date UTC", "")),
+    ("Date of Last Action UTC",     lambda d: d["fields"].get("Date of Last Action UTC", "")),
+    ("Status",                      lambda d: d["fields"].get("Status", "")),
+    ("Type",                        lambda d: d["fields"].get("Type", "")),
+    ("Next Action Required",        lambda d: d["fields"].get("Next Action Required", "")),
+    ("Category",                    lambda d: d["fields"].get("Category", "")),
+    ("Total Open Period",           lambda d: d["fields"].get("Total Open Period", "")),
+    ("Total Open Time (Days)",      lambda d: _num(d["fields"].get("Total Open Time (Days)", ""))),
+    ("Current Status Time (Days)",  lambda d: _num(d["fields"].get("Current Status Time (Days)", ""))),
+    ("Reason for Change",           lambda d: d["fields"].get("Reason for Change", "")),
+    ("Description",                 lambda d: d["fields"].get("Description", "")),
+    ("Resolution",                  lambda d: d["fields"].get("Resolution", "")),
+    ("Query History",               lambda d: d["fields"].get("Query History", "")),
+    ("Age at Informed Consent",     lambda d: d["fields"].get("Age at Informed Consent", "")),
+    ("Baseline Stool Count",        lambda d: _num(d["fields"].get("Baseline Stool Count", ""))),
+    ("firstSeen",                   lambda d: d.get("firstSeen", "")),
+    ("lastSeen",                    lambda d: d.get("lastSeen", "")),
+]
+
+COLUMNS_ECG_DCRS = [
+    ("Site ID",                     lambda d: d.get("site", {}).get("name", "")),
+    ("Subject Number",              lambda d: d.get("subject", {}).get("id", "")),
+    ("Data Correction ID",          lambda d: d["fields"].get("Data Correction ID", "")),
+    ("PI Name",                     lambda d: d["fields"].get("PI_NAME", "")),
+    ("Age",                         lambda d: d["fields"].get("Age", "")),
+    ("Creation Date UTC",           lambda d: d["fields"].get("Creation Date UTC", "")),
+    ("Date of Last Action UTC",     lambda d: d["fields"].get("Date of Last Action UTC", "")),
+    ("Status",                      lambda d: d["fields"].get("Status", "")),
+    ("Type",                        lambda d: d["fields"].get("Type", "")),
+    ("Next Action Required",        lambda d: d["fields"].get("Next Action Required", "")),
+    ("Category",                    lambda d: d["fields"].get("Category", "")),
+    ("Total Open Period",           lambda d: d["fields"].get("Total Open Period", "")),
+    ("Total Open Time (Days)",      lambda d: _num(d["fields"].get("Total Open Time (Days)", ""))),
+    ("Current Status Time (Days)",  lambda d: _num(d["fields"].get("Current Status Time (Days)", ""))),
+    ("Reason for Change",           lambda d: d["fields"].get("Reason for Change", "")),
+    ("Query History",               lambda d: d["fields"].get("Query History", "")),
+    ("firstSeen",                   lambda d: d.get("firstSeen", "")),
+    ("lastSeen",                    lambda d: d.get("lastSeen", "")),
+]
+
+# ---------------------------------------------------------------------------
+# Helpers
+# ---------------------------------------------------------------------------
+
+def _num(value):
+    """Convert a numeric string to int (or float); return None for empty input, else the original value."""
+    if value == "" or value is None:
+        return None
+    try:
+        return int(value)
+    except (ValueError, TypeError):
+        try:
+            return float(value)
+        except (ValueError, TypeError):
+            return value
+
+
+def _visit_sort_key(doc):
+    visit = doc["fields"].get("Visit", "")
+    try:
+        idx = VISIT_ORDER.index(visit)
+    except ValueError:
+        idx = len(VISIT_ORDER)
+    return (doc.get("site", {}).get("name", ""), doc.get("subject", {}).get("id", ""), idx, visit)
+
+
+def _iso_to_date(value):
+    """Convert an ISO string to a Python date for Excel; return anything else unchanged."""
+    if not isinstance(value, str):
+        return value
+    try:
+        return datetime.fromisoformat(value).date()
+    except ValueError:
+        return value
+
+
+# ---------------------------------------------------------------------------
+# Styles
+# ---------------------------------------------------------------------------
+
+HEADER_FILL = PatternFill("solid", fgColor="1F497D")
+HEADER_FONT = Font(bold=True, color="FFFFFF", size=10)
+CELL_FONT   = Font(size=10)
+ALIGN_CTR   = Alignment(horizontal="center", vertical="center", wrap_text=False)
+ALIGN_LEFT  = Alignment(horizontal="left",   vertical="center")
+
+THIN = Side(style="thin", color="BFBFBF")
+BORDER = Border(left=THIN, right=THIN, top=THIN, bottom=THIN)
+
+# zebra striping (alternating row fills)
+FILL_ODD  = PatternFill("solid", fgColor="FFFFFF")
+FILL_EVEN = PatternFill("solid", fgColor="EBF1DE")
+
+# DCR status colors
+FILL_DCR_SITE    = PatternFill("solid", fgColor="FFFF00")   # yellow: waiting for the site physician
+FILL_DCR_CLARIO  = PatternFill("solid", fgColor="BDD7EE")   # blue:   waiting for Clario
+FILL_DCR_QC      = PatternFill("solid", fgColor="F4B942")   # orange: ReadyForQC
+FILL_DCR_DONE    = PatternFill("solid", fgColor="FFFFFF")   # white:  Completed
+
+SCORE_COLS = {"Partial Mayo Score", "Modified Mayo Score", "Full Mayo Score"}
+SCORE_FILL = PatternFill("solid", fgColor="FFC7CE")   # red for scores ≥ 5 (placeholder; conditional formatting is not used)
+
+
+# ---------------------------------------------------------------------------
+# Sheet construction
+# ---------------------------------------------------------------------------
+
+def _build_sheet(ws, docs, columns, date_cols, center_cols, col_widths, row_font_fn=None, wrap_cols=None, header_row=1):
+    headers = [c[0] for c in columns]
+
+    for col_idx, header in enumerate(headers, 1):
+        cell = ws.cell(row=header_row, column=col_idx, value=header)
+        cell.font = HEADER_FONT
+        cell.fill = HEADER_FILL
+        cell.alignment = ALIGN_CTR
+        cell.border = BORDER
+    ws.row_dimensions[header_row].height = 28
+
+    data_start = header_row + 1
+    for row_idx, doc in enumerate(docs, data_start):
+        fill = FILL_EVEN if (row_idx - header_row) % 2 == 0 else FILL_ODD
+        font = row_font_fn(doc) if row_font_fn else CELL_FONT
+        for col_idx, (col_name, getter) in enumerate(columns, 1):
+            value = getter(doc)
+            if col_name in date_cols and isinstance(value, str):
+                value = _iso_to_date(value)
+            cell = ws.cell(row=row_idx, column=col_idx, value=value)
+            cell.font = font
+            cell.fill = fill
+            cell.border = BORDER
+            if wrap_cols and col_name in wrap_cols:
+                cell.alignment = Alignment(horizontal="left", vertical="top", wrap_text=True)
+            else:
+                cell.alignment = ALIGN_CTR if col_name in center_cols else ALIGN_LEFT
+
+    for col_idx, (col_name, _) in enumerate(columns, 1):
+        ws.column_dimensions[get_column_letter(col_idx)].width = col_widths.get(col_name, 14)
+
+    for col_name in date_cols:
+        if col_name in headers:
+            letter = get_column_letter(headers.index(col_name) + 1)
+            for row_idx in range(data_start, len(docs) + data_start):
+                ws[f"{letter}{row_idx}"].number_format = "DD-MMM-YYYY"
+
+    ws.freeze_panes = f"A{data_start}"
+    ws.auto_filter.ref = f"A{header_row}:{get_column_letter(len(headers))}{header_row}"
+
+
+def _score_row_font(doc):
+    visit = doc["fields"].get("Visit", "")
+    try:
+        mod_mayo = int(doc["fields"].get("Modified Mayo Score", ""))
+    except (ValueError, TypeError):
+        mod_mayo = None
+    if visit == "I-0" and mod_mayo is not None and mod_mayo < 5:
+        return Font(size=10, bold=True, color="FF0000")
+    return CELL_FONT
+
+
+def build_mayo_score_sheet(ws, docs):
+    _build_sheet(
+        ws, docs, COLUMNS_SCORE,
+        date_cols={"Visit Date", "Last Mayo Score Submission"},
+        center_cols={"KLIKNI SEM", "Visit", "Central Endoscopy Score", "PGA Score",
+                     "Stool Frequency Sub-score", "Rectal Bleeding Sub-score",
+                     "Partial Mayo Score", "Modified Mayo Score", "Full Mayo Score",
+                     "Baseline Stool Frequency",
+                     "Wk I-12 Responder", "Wk I-12 Remission", "Clinical Flare",
+                     "Loss of Response", "Partial Mayo Post LoR", "Partial Mayo Non-Resp",
+                     "Last Mayo Score Submission"},
+        col_widths={
+            "KLIKNI SEM": 14,
+            "Site": 18, "Subject ID": 16, "Visit": 12, "Visit Date": 14,
+            "Baseline Stool Frequency": 14, "Central Endoscopy Score": 14,
+            "PGA Score": 10, "Stool Frequency Sub-score": 14,
+            "Rectal Bleeding Sub-score": 14, "Partial Mayo Score": 14,
+            "Modified Mayo Score": 14, "Full Mayo Score": 13,
+            "Site Action": 22, "Last Mayo Score Submission": 16,
+            "Wk I-12 Responder": 14, "Wk I-12 Remission": 14,
+            "Clinical Flare": 14, "Loss of Response": 14,
+            "Partial Mayo Post LoR": 20, "Partial Mayo Non-Resp": 20,
+        },
+        row_font_fn=_score_row_font,
+    )
+    # Special style for the KLIKNI SEM column so it looks like a button/link
+    link_font = Font(size=10, bold=True, color="FFFFFF")
+    link_fill = PatternFill("solid", fgColor="2E75B6")
+    for row in range(2, len(docs) + 2):
+        cell = ws.cell(row=row, column=1)
+        cell.font = link_font
+        cell.fill = link_fill
+        cell.alignment = ALIGN_CTR
+
+
+def build_mayo_diary_sheet(ws, docs):
+    _build_sheet(
+        ws, docs, COLUMNS_DIARY,
+        date_cols={"Report Date"},
+        center_cols={"Baseline Stool Count", "Stool Frequency", "Not Applicable",
+                     "Constipation", "Diarrhea", "Irregularity"},
+        col_widths={
+            "Subject ID": 16, "Report Date": 14, "Baseline Stool Count": 14,
+            "Stool Frequency": 14, "MAYO050": 48, "Not Applicable": 14,
+            "Constipation": 14, "Diarrhea": 12, "Irregularity": 14,
+        },
+    )
+
+
+def build_eligible_days_sheet(ws, score_docs, diary_docs):
+    # Lookup diary records by (subject_id, date_part YYYY-MM-DD)
+    diary_lookup: dict[tuple, dict] = {}
+    for d in diary_docs:
+        subj = d.get("subject", {}).get("id", "")
+        date_iso = d["fields"].get("Report Date", "")
+        date_part = date_iso[:10] if date_iso else ""
+        if subj and date_part:
+            diary_lookup[(subj, date_part)] = d
+
+    headers = [
+        "Included", "Subject ID", "Visit", "Visit Date", "Day",
+        "Report Date", "Baseline Stool Count", "Stool Frequency",
+        "MAYO050", "Not Applicable", "Constipation", "Diarrhea", "Irregularity",
+    ]
+    col_widths = {
+        "Included": 10, "Subject ID": 16, "Visit": 10, "Visit Date": 14, "Day": 8,
+        "Report Date": 14, "Baseline Stool Count": 14, "Stool Frequency": 14,
+        "MAYO050": 48, "Not Applicable": 14, "Constipation": 14,
+        "Diarrhea": 12, "Irregularity": 14,
+    }
+    center_cols = {"Included", "Visit", "Day", "Baseline Stool Count", "Stool Frequency",
+                   "Not Applicable", "Constipation", "Diarrhea", "Irregularity"}
+    date_cols = {"Visit Date", "Report Date"}
+    no_fill = PatternFill("solid", fgColor="FFF2CC")  # yellow for excluded days
+
+    for col_idx, header in enumerate(headers, 1):
+        cell = ws.cell(row=1, column=col_idx, value=header)
+        cell.font = HEADER_FONT
+        cell.fill = HEADER_FILL
+        cell.alignment = ALIGN_CTR
+        cell.border = BORDER
+    ws.row_dimensions[1].height = 28
+
+    row_idx = 2
+    for score_doc in score_docs:
+        subj = score_doc.get("subject", {}).get("id", "")
+        visit = score_doc["fields"].get("Visit", "")
+        visit_date = score_doc["fields"].get("Visit Date", "")
+
+        for n in range(1, 11):
+            day_date_iso = score_doc["fields"].get(f"Eligible Day (-{n})")
+            if not day_date_iso or day_date_iso == "-":
+                continue
+            date_part = day_date_iso[:10]
+            excl_reason = score_doc["fields"].get(f"Day (-{n}) Excluded Reason(s)", "")
+            included = "No" if excl_reason and excl_reason != "-" else "Yes"
+
+            diary = diary_lookup.get((subj, date_part), {})
+            df = diary.get("fields", {})
+
+            fill = no_fill if included == "No" else (FILL_EVEN if row_idx % 2 == 0 else FILL_ODD)
+            font = Font(size=10, color="808080") if included == "No" else CELL_FONT
+
+            values = [
+                included,
+                subj,
+                visit,
+                _iso_to_date(visit_date) if isinstance(visit_date, str) else visit_date,
+                f"-{n}",
+                _iso_to_date(day_date_iso),
+                _num(df.get("Baseline Stool Count", "")),
+                _num(df.get("Stool Frequency", "")),
+                df.get("MAYO050", ""),
+                df.get("Not Applicable", ""),
+                df.get("Constipation", ""),
+                df.get("Diarrhea", ""),
+                df.get("Irregularity", ""),
+            ]
+
+            for col_idx, (header, value) in enumerate(zip(headers, values), 1):
+                cell = ws.cell(row=row_idx, column=col_idx, value=value)
+                cell.font = font
+                cell.fill = fill
+                cell.border = BORDER
+                if header in date_cols:
+                    cell.number_format = "DD-MMM-YYYY"
+                cell.alignment = ALIGN_CTR if header in center_cols else ALIGN_LEFT
+
+            row_idx += 1
+
+    for col_idx, header in enumerate(headers, 1):
+        ws.column_dimensions[get_column_letter(col_idx)].width = col_widths.get(header, 14)
+
+    ws.freeze_panes = "A2"
+    ws.auto_filter.ref = f"A1:{get_column_letter(len(headers))}1"
+
+
+def _build_dcr_legend(ws):
+    """Insert the legend into rows 1-4 and leave row 5 empty. The header goes on row 6, data starts at row 7."""
+    legend = [
+        (FILL_DCR_SITE,   "Čeká lékař — Next Action Required = Site (lékař musí odpovědět nebo potvrdit)"),
+        (FILL_DCR_CLARIO, "Čeká Clario — Next Action Required = Clario DM (Clario dostalo podklady, provede změnu)"),
+        (FILL_DCR_QC,     "ReadyForQC — Clario provedlo změny, čeká na finální QC kontrolu"),
+        (FILL_DCR_DONE,   "Completed / Resolved — DCR je uzavřen"),
+    ]
+    for i, (fill, text) in enumerate(legend, 1):
+        a = ws.cell(row=i, column=1, value="")
+        a.fill = fill
+        a.border = BORDER
+        b = ws.cell(row=i, column=2, value=text)
+        b.font = Font(size=10, bold=True)
+        b.alignment = ALIGN_LEFT
+    # row 5 stays empty, nothing to do
+
+
+def _dcr_row_fill(doc):
+    """Return the fill color for the DCR's status."""
+    status = doc["fields"].get("Status", "")
+    next_action = doc["fields"].get("Next Action Required", "")
+    if status in ("Completed", "Resolved"):
+        return FILL_DCR_DONE
+    if status == "ReadyForQC":
+        return FILL_DCR_QC
+    if "Site" in next_action:
+        return FILL_DCR_SITE
+    if "Clario" in next_action or next_action == "":
+        return FILL_DCR_CLARIO
+    return FILL_ODD
+
+
+def build_ecoa_dcrs_sheet(ws, docs):
+    _build_dcr_legend(ws)
+    docs_sorted = sorted(docs, key=lambda d: (
+        d.get("site", {}).get("name", ""),
+        d.get("subject", {}).get("id", ""),
+        d["fields"].get("Creation Date UTC", ""),
+    ))
+    _build_sheet(
+        ws, docs_sorted, COLUMNS_ECOA_DCRS,
+        date_cols={"Creation Date UTC", "Date of Last Action UTC"},
+        center_cols={"Status", "Type", "Next Action Required", "Category",
+                     "Total Open Time (Days)", "Current Status Time (Days)",
+                     "Baseline Stool Count", "firstSeen", "lastSeen"},
+        col_widths={
+            "Site": 16, "Subject ID": 16, "Data Correction ID": 18,
+            "PI Name": 18, "Creation Date UTC": 14, "Date of Last Action UTC": 14,
+            "Status": 14, "Type": 16, "Next Action Required": 16, "Category": 20,
+            "Total Open Period": 14, "Total Open Time (Days)": 14,
+            "Current Status Time (Days)": 16, "Reason for Change": 20,
+            "Description": 50, "Resolution": 50, "Query History": 60,
+            "Age at Informed Consent": 14, "Baseline Stool Count": 14,
+            "firstSeen": 12, "lastSeen": 12,
+        },
+        wrap_cols={"Reason for Change", "Description", "Resolution", "Query History"},
+        header_row=6,
+        row_font_fn=lambda doc: CELL_FONT,
+    )
+    # Recolor the rows by DCR status (overrides the zebra fill)
+    data_start = 7
+    for row_idx, doc in enumerate(docs_sorted, data_start):
+        fill = _dcr_row_fill(doc)
+        for col_idx in range(1, len(COLUMNS_ECOA_DCRS) + 1):
+            ws.cell(row=row_idx, column=col_idx).fill = fill
+
+
+def build_ecg_dcrs_sheet(ws, docs):
+    _build_dcr_legend(ws)
+    docs_sorted = sorted(docs, key=lambda d: (
+        d.get("site", {}).get("name", ""),
+        d.get("subject", {}).get("id", ""),
+        d["fields"].get("Creation Date UTC", ""),
+    ))
+    _build_sheet(
+        ws, docs_sorted, COLUMNS_ECG_DCRS,
+        date_cols={"Creation Date UTC", "Date of Last Action UTC"},
+        center_cols={"Status", "Type", "Next Action Required", "Category",
+                     "Total Open Time (Days)", "Current Status Time (Days)",
+                     "firstSeen", "lastSeen"},
+        col_widths={
+            "Site ID": 14, "Subject Number": 16, "Data Correction ID": 16,
+            "PI Name": 18, "Age": 10, "Creation Date UTC": 14,
+            "Date of Last Action UTC": 14, "Status": 14, "Type": 12,
+            "Next Action Required": 16, "Category": 14,
+            "Total Open Period": 14, "Total Open Time (Days)": 14,
+            "Current Status Time (Days)": 16, "Reason for Change": 20,
+            "Query History": 60, "firstSeen": 12, "lastSeen": 12,
+        },
+        wrap_cols={"Query History"},
+        header_row=6,
+        row_font_fn=lambda doc: CELL_FONT,
+    )
+    # Recolor the rows by DCR status (overrides the zebra fill)
+    data_start = 7
+    for row_idx, doc in enumerate(docs_sorted, data_start):
+        fill = _dcr_row_fill(doc)
+        for col_idx in range(1, len(COLUMNS_ECG_DCRS) + 1):
+            ws.cell(row=row_idx, column=col_idx).fill = fill
+
+
+# ---------------------------------------------------------------------------
+# Helpers: output path
+# ---------------------------------------------------------------------------
+
+def _unique_path(directory: Path, stem: str, suffix: str) -> Path:
+    candidate = directory / f"{stem}{suffix}"
+    if not candidate.exists():
+        return candidate
+    n = 2
+    while True:
+        candidate = directory / f"{stem} ({n}){suffix}"
+        if not candidate.exists():
+            return candidate
+        n += 1
+
+
+# ---------------------------------------------------------------------------
+# Timing helper
+# ---------------------------------------------------------------------------
+
+def _tick(label: str, t0: float) -> float:
+    """Print the time elapsed since t0 and return the current time as the new t0."""
+    elapsed = time.perf_counter() - t0
+    print(f"  {label:<30} {elapsed:6.2f} s")
+    return time.perf_counter()
+
+
+# ---------------------------------------------------------------------------
+# Main
+# ---------------------------------------------------------------------------
+
+def main():
+    t_total = time.perf_counter()
+    print("Spouštím generování reportu...")
+    print()
+
+    # -- 1. MongoDB: connect + fetch + sort ----------------------------------
+    t = time.perf_counter()
+    client = MongoClient(MONGO_URI, serverSelectionTimeoutMS=5000)
+    client.admin.command("ping")
+    db = client[DB_NAME]
+    score_docs = list(db["Clario.MayoScore"].find({}))
+    diary_docs = list(db["Clario.MayoDiary"].find({}))
+    ecoa_dcr_docs = list(db["Clario.eCOA_DCRs"].find({}))
+    ecg_dcr_docs  = list(db["Clario.ECG_DCRs"].find({}))
+    client.close()
+    score_docs.sort(key=_visit_sort_key)
+    diary_docs.sort(key=lambda d: (
+        d.get("subject", {}).get("id", ""),
+        d["fields"].get("Report Date", ""),
+    ))
+    t = _tick(f"MongoDB (ping, fetch, sort  →  {len(score_docs)} + {len(diary_docs)} + {len(ecoa_dcr_docs)} + {len(ecg_dcr_docs)} záznamů)", t)
+
+    # -- 2–4. Build the sheets -----------------------------------------------
+    wb = Workbook()
+    ws_score = wb.active
+    ws_score.title = "MayoScore"
+    build_mayo_score_sheet(ws_score, score_docs)
+    t = _tick("List MayoScore    (KLIKNI SEM, zebra, červené I-0, autofilter)", t)
+
+    ws_diary = wb.create_sheet("MayoDiary")
+    build_mayo_diary_sheet(ws_diary, diary_docs)
+    t = _tick("List MayoDiary    (zebra, formátování dat, autofilter)", t)
+
+    ws_days = wb.create_sheet("EligibleDays")
+    build_eligible_days_sheet(ws_days, score_docs, diary_docs)
+    t = _tick("List EligibleDays (diary lookup, included/excluded flag, autofilter)", t)
+
+    ws_ecoa = wb.create_sheet("eCOA_DCRs")
+    build_ecoa_dcrs_sheet(ws_ecoa, ecoa_dcr_docs)
+    t = _tick(f"List eCOA_DCRs    ({len(ecoa_dcr_docs)} záznamů)", t)
+
+    ws_ecg = wb.create_sheet("ECG_DCRs")
+    build_ecg_dcrs_sheet(ws_ecg, ecg_dcr_docs)
+    t = _tick(f"List ECG_DCRs     ({len(ecg_dcr_docs)} záznamů)", t)
+
+    # -- 5. Save the XLSX ----------------------------------------------------
+    OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
+    today = datetime.now().strftime("%Y-%m-%d")
+    base_stem = f"{today} 77242113UCO3001 Clario Reports"
+    xlsm_path = _unique_path(OUTPUT_DIR, base_stem, ".xlsm")
+    xlsx_path = xlsm_path.with_suffix(".xlsx")
+    wb.save(str(xlsx_path))
+    t = _tick("Uložení XLSX      (openpyxl, dočasný soubor)", t)
+
+    # -- 6. Inject the VBA ---------------------------------------------------
+    inject_vba(xlsx_path, xlsm_path)
+    xlsx_path.unlink(missing_ok=True)
+    _tick("Injektování VBA   (xlwings: open → AddFromString → SaveAs .xlsm)", t)
+
+    # -- Summary -------------------------------------------------------------
+    total = time.perf_counter() - t_total
+    print()
+    print(f"  {'Celkem':<30} {total:6.2f} s")
+    print()
+    print(f"Uloženo: {xlsm_path}")
+
+
+def inject_vba(xlsx_path: Path, xlsm_path: Path) -> None:
+    vba_code = '''\
+Private Sub Worksheet_SelectionChange(ByVal Target As Range)
+    If Target.Row < 2 Then Exit Sub
+    If Target.Rows.Count > 1 Then Exit Sub
+    If Target.Column <> 1 Then Exit Sub
+
+    Dim subjectId As String
+    Dim visit As String
+    subjectId = CStr(Me.Cells(Target.Row, 3).Value)
+    visit = CStr(Me.Cells(Target.Row, 4).Value)
+
+    If subjectId = "" Or visit = "" Then Exit Sub
+
+    Dim ws As Worksheet
+    On Error Resume Next
+    Set ws = ThisWorkbook.Sheets("EligibleDays")
+    On Error GoTo 0
+    If ws Is Nothing Then Exit Sub
+
+    Application.ScreenUpdating = False
+
+    ws.AutoFilterMode = False
+    ws.Range("A1").AutoFilter
+    ws.Range("A1").AutoFilter Field:=2, Criteria1:=subjectId
+    ws.Range("A1").AutoFilter Field:=3, Criteria1:=visit
+
+    ws.Activate
+    ws.Range("A2").Select
+
+    Application.ScreenUpdating = True
+End Sub
+'''
+
+    app = xw.App(visible=False)
+    try:
+        wb = app.books.open(str(xlsx_path))
+        # Find the VBComponent that belongs to the sheet whose tab name is "MayoScore"
+        vb_comp = None
+        for comp in wb.api.VBProject.VBComponents:
+            if comp.Type == 100:  # 100 = vbext_ct_Document (sheet or workbook module)
+                try:
+                    if comp.Properties("Name").Value == "MayoScore":
+                        vb_comp = comp
+                        break
+                except Exception:
+                    pass
+        if vb_comp is None:
+            # Fallback: the first sheet (code name Sheet1)
+            vb_comp = wb.api.VBProject.VBComponents("Sheet1")
+        vb_comp.CodeModule.AddFromString(vba_code)
+        wb.api.SaveAs(str(xlsm_path), FileFormat=52)  # 52 = xlOpenXMLWorkbookMacroEnabled
+        wb.close()
+    finally:
+        app.quit()
+
+
+if __name__ == "__main__":
+    main()
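A note on the `_unique_path` helper in the file above: it returns the plain name when it is free and otherwise tries ` (2)`, ` (3)` and so on until an unused name is found. The sketch below restates that logic in an equivalent loop and exercises it in a temporary directory. The names `unique_path` and `demo` are illustrative and not part of the commit.

```python
import tempfile
from pathlib import Path


def unique_path(directory: Path, stem: str, suffix: str) -> Path:
    """Return directory/stem+suffix, or 'stem (n)'+suffix for the first free n >= 2."""
    candidate = directory / f"{stem}{suffix}"
    n = 2
    while candidate.exists():
        candidate = directory / f"{stem} ({n}){suffix}"
        n += 1
    return candidate


def demo() -> list[str]:
    """Create three files with the same stem and collect the names that were chosen."""
    with tempfile.TemporaryDirectory() as tmp:
        directory = Path(tmp)
        names = []
        for _ in range(3):
            path = unique_path(directory, "report", ".xlsm")
            path.touch()  # occupy the name so the next call has to move on
            names.append(path.name)
        return names


if __name__ == "__main__":
    print(demo())
```

The check and the later write are separate steps, so two processes running at the same moment could still pick the same name. For a report generated by one user at a time that is probably acceptable.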
@@ -0,0 +1,293 @@
+# Name:         janssenpc_file_send.py
+# Version:      2.2
+# Date:         2026-06-02
+# Description:  Renames the files in the ##JNJPrenos folder, uploads them to msgs.buzalka.cz
+#               and moves them to the Trash subfolder. Logs progress to file_send.log next to the script.
+#               Supports: PANORAMA Site Contacts (xlsx), Panorama Dashboard (xlsx),
+#               Site Visit Report (xlsx), Follow-Up Letter (xlsx),
+#               Clario MayoScore (csv), Clario MayoDiary (csv),
+#               Clario Data Corrections / DCRs (csv).
+
+import os
+import time
+import shutil
+import requests
+import pandas as pd
+from pathlib import Path
+from datetime import datetime
+
+TOKEN = "13e1bb01-9fd5-44a8-8ce9-4ee27133d340"
+UPLOAD_URL = "https://msgs.buzalka.cz/upload-dropbox"
+SOURCE_DIR = Path(r"C:\Users\vbuzalka\OneDrive - JNJ\##JNJPrenos")
+TRASH_DIR = SOURCE_DIR / "Trash"
+LOG_FILE = Path(__file__).parent / "file_send.log"
+
+MAYO_DIARY_COLUMNS = [
+    'Protocol', 'Country', 'Site', 'PI Name', 'Subject ID',
+    'Report Date', 'Report Start Date/Time', 'Report End Date/Time',
+    'Stool Frequency', 'Form Number', 'Role', 'Original Source',
+]
+
+MAYO_SCORE_COLUMNS = [
+    'Protocol', 'Study Population', 'Country', 'Site', 'Principal Investigator',
+    'Participant ID', 'Baseline Stool Frequency', 'Visit', 'Visit Date',
+    'Endoscopy Completed?', 'Central Endoscopy Score', 'Local Endoscopy Score',
+    'Partial Mayo Score', 'Full Mayo Score',
+]
+
+DCR_ECOA_COLUMNS = [
+    'Protocol', 'Data Correction ID', 'Description', 'Query History',
+]
+
+DCR_ECG_COLUMNS = [
+    'Protocol', 'Data Correction ID', 'Site ID', 'PI_NAME', 'Subject Number', 'Query History',
+]
+
+PANORAMA_COLUMNS = [
+    'Part', 'Source', 'Sector', 'TA', 'Protocol ID', 'Interventional',
+    'Region', 'Country Name', 'Institution Name', 'Site City',
+    'Site Zip/Postal Code', 'Site Address', 'MSID', 'Site ID',
+    'Site Status', 'SM Full Name', 'PI Name', 'St F Subj Enr Act',
+    'ID', 'Category', 'Type', 'Priority', 'Severity', 'Description',
+    'Brief Description - Subject ID', 'Comments', 'Created By',
+    'Create Date', 'Last Modified Date', 'Start Date', 'Due Date',
+    'End Date', 'Status', 'Days Outstanding', 'Action Taken',
+    'Escalated To', 'Visit Report Status', 'Visit Report Approved',
+    'Visit Report Type', 'Visit Report Status End Date', 'Active',
+    'Association', 'Deviation', 'Deviation Closed Date', 'Reason For Exclusion'
+]
+
+
+def log(msg: str):
+    ts = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
+    line = f"[{ts}] {msg}"
+    print(line)
+    with LOG_FILE.open("a", encoding="utf-8") as lf:
+        lf.write(line + "\n")
+
+
+def move_to_trash(f: Path):
+    TRASH_DIR.mkdir(exist_ok=True)
+    dest = TRASH_DIR / f.name
+    if dest.exists():
+        ts = datetime.now().strftime('%Y%m%d_%H%M%S')
+        dest = TRASH_DIR / f"{f.stem}_{ts}{f.suffix}"
+    shutil.move(str(f), dest)
+
+
+def get_timestamp(file_path: str) -> str:
+    return datetime.fromtimestamp(os.path.getmtime(file_path)).strftime('%Y-%m-%d_%H-%M-%S')
+
+
+def prejmenuj(directory: Path) -> None:
+    log(f"--- Přejmenování, adresář: {directory} ---")
+    files = [f for f in directory.iterdir() if f.is_file()]
+    log(f"  Nalezeno souborů: {len(files)} — {[f.name for f in files]}")
+
+    for f in files:
+        filename = f.name
+        file_path = str(f)
+
+        # 0a. CLARIO MAYO DIARY (CSV)
+        if 'MAYO-DIARY' in filename and filename.endswith('.csv'):
+            log(f"  Detekován MayoDiary: {filename}")
+            try:
+                df = pd.read_csv(file_path)
+                missing = set(MAYO_DIARY_COLUMNS) - set(df.columns)
+                if not missing:
+                    protocols = df['Protocol'].dropna().unique()
+                    log(f"    Protocol: {list(protocols)}")
+                    if len(protocols) > 0:
+                        study = str(protocols[0]).strip()
+                        new_name = f"{get_timestamp(file_path)} {study} Clario MayoDiary.csv"
+                        f.rename(directory / new_name)
+                        log(f"    ÚSPĚCH: -> '{new_name}'")
+                    else:
+                        log(f"    VAROVÁNÍ: Sloupec Protocol je prázdný.")
+                else:
+                    log(f"    PŘESKOČENO: Chybí sloupce: {missing}")
+            except Exception as e:
+                log(f"    CHYBA: {e}")
+            continue
+
+        # 0b. CLARIO MAYO SCORE (CSV)
+        if 'Custom.MayoScoreReport' in filename and filename.endswith('.csv'):
+            log(f"  Detekován MayoScore: {filename}")
+            try:
+                df = pd.read_csv(file_path)
+                missing = set(MAYO_SCORE_COLUMNS) - set(df.columns)
+                if not missing:
+                    protocols = df['Protocol'].dropna().unique()
+                    log(f"    Protocol: {list(protocols)}")
+                    if len(protocols) > 0:
+                        study = str(protocols[0]).strip()
+                        new_name = f"{get_timestamp(file_path)} {study} Clario MayoScore.csv"
+                        f.rename(directory / new_name)
+                        log(f"    ÚSPĚCH: -> '{new_name}'")
+                    else:
+                        log(f"    VAROVÁNÍ: Sloupec Protocol je prázdný.")
+                else:
+                    log(f"    PŘESKOČENO: Chybí sloupce: {missing}")
+            except Exception as e:
+                log(f"    CHYBA: {e}")
+            continue
+
+        # 0c. CLARIO DATA CORRECTIONS (CSV): ECG or eCOA
+        if filename.endswith('.csv'):
+            try:
+                df = pd.read_csv(file_path, nrows=2)
+                cols = set(df.columns)
+                log(f"  CSV sloupce ({filename}): {sorted(cols)}")
+
+                missing_ecg  = set(DCR_ECG_COLUMNS)  - cols
+                missing_ecoa = set(DCR_ECOA_COLUMNS) - cols
+                log(f"    Chybí pro ECG:  {missing_ecg or '—'}")
+                log(f"    Chybí pro eCOA: {missing_ecoa or '—'}")
+
+                if not missing_ecg:
+                    label = "Clario ECG DCRs"
+                elif not missing_ecoa:
+                    label = "Clario eCOA DCRs"
+                else:
+                    log(f"  Neznámý CSV typ — bude odeslán bez přejmenování: {filename}")
+                    # No rename here; the file is still picked up later by the upload step
+                    label = None
+
+                if label:
+                    log(f"  Detekován {label}: {filename}")
+                    protocols = df['Protocol'].dropna().unique()
+                    log(f"    Protocol: {list(protocols)}")
+                    if len(protocols) > 0:
+                        study = str(protocols[0]).strip()
+                        new_name = f"{get_timestamp(file_path)} {study} {label}.csv"
+                        f.rename(directory / new_name)
+                        log(f"    ÚSPĚCH přejmenování: -> '{new_name}'")
+                    else:
+                        log(f"    VAROVÁNÍ: Sloupec Protocol je prázdný — odesílám pod původním názvem.")
+            except Exception as e:
+                log(f"    CHYBA při zpracování CSV {filename}: {e}")
+            continue
+
+        # Everything else: xlsx only
+        if not filename.endswith('.xlsx'):
+            log(f"  Přeskočeno (neznámý typ): {filename}")
+            continue
+
+        # 1a. PANORAMA SITE CONTACTS (XLSX): the file is named "PANORAMA Dashboard"
+        if 'PANORAMA Dashboard' in filename:
+            log(f"  Detekován PANORAMA Site Contacts: {filename}")
+            try:
+                with pd.ExcelFile(file_path) as xl:
+                    sheet_names = xl.sheet_names
+                    if 'Site Contacts' in sheet_names:
+                        df_a1 = xl.parse('Site Contacts', nrows=1, header=None)
+                        a1 = str(df_a1.iloc[0, 0]) if not df_a1.empty else ''
+                    else:
+                        a1 = None
+                # The file is closed now, so the rename will not fail on an open handle
+                if a1 is None:
+                    log(f"    PŘESKOČENO: List 'Site Contacts' nenalezen.")
+                elif 'Title: Site Contacts' in a1:
+                    new_name = f"{get_timestamp(file_path)} PANORAMA Site Contacts.xlsx"
+                    f.rename(directory / new_name)
+                    log(f"    ÚSPĚCH: -> '{new_name}'")
+                else:
+                    log(f"    PŘESKOČENO: A1 neodpovídá vzoru ({a1[:50]})")
+            except Exception as e:
+                log(f"    CHYBA: {e}")
+            continue
+
+        # 1. PANORAMA DASHBOARD (XLSX)
+        if 'Panorama Dashboard' in filename:
+            log(f"  Detekován Panorama: {filename}")
+            try:
+                df = pd.read_excel(file_path, skiprows=5)
+                missing = set(PANORAMA_COLUMNS) - set(df.columns)
+                if not missing:
+                    ids = df['Protocol ID'].dropna().unique()
+                    log(f"    Protocol ID: {list(ids)}")
+                    if len(ids) > 0:
+                        study = str(ids[0]).strip()
+                        new_name = f"{get_timestamp(file_path)} {study} Panorama Deviations and Issues.xlsx"
+                        f.rename(directory / new_name)
+                        log(f"    ÚSPĚCH: -> '{new_name}'")
+                    else:
+                        log(f"    VAROVÁNÍ: Protocol ID je prázdný.")
+                else:
+                    log(f"    PŘESKOČENO: Chybí sloupce: {missing}")
+            except Exception as e:
+                log(f"    CHYBA: {e}")
+            continue
+
+        # 2. SITE VISIT REPORT AND FOLLOW-UP LETTER (XLSX)
+        try:
+            df_a1 = pd.read_excel(file_path, nrows=1, header=None)
+            if not df_a1.empty:
+                a1 = str(df_a1.iloc[0, 0])
+                log(f"  A1: {a1[:80]}")
+                is_site_visit = "Title: Site Visit Report Details" in a1
+                is_follow_up = "Title: Follow-Up Letter Details" in a1
+
+                if is_site_visit or is_follow_up:
+                    suffix = "Site Visit Details.xlsx" if is_site_visit else "FUL details.xlsx"
+                    log(f"  Detekován {'Site Visit' if is_site_visit else 'Follow-Up Letter'}: {filename}")
+                    df = pd.read_excel(file_path, skiprows=5)
+                    if 'Protocol ID' in df.columns:
+                        ids = df['Protocol ID'].dropna().unique()
+                        log(f"    Protocol ID: {list(ids)}")
+                        if len(ids) > 0:
+                            study = str(ids[0]).strip()
+                            new_name = f"{get_timestamp(file_path)} {study} {suffix}"
+                            f.rename(directory / new_name)
+                            log(f"    ÚSPĚCH: -> '{new_name}'")
+                        else:
+                            log(f"    VAROVÁNÍ: Protocol ID je prázdný.")
+                    else:
+                        log(f"    PŘESKOČENO: Chybí sloupec Protocol ID.")
+                else:
+                    log(f"  Přeskočeno (neznámý xlsx obsah): {filename}")
+        except Exception as e:
+            log(f"  CHYBA: {e}")
+
+    log("--- Přejmenování dokončeno ---")
+
+
+# === MAIN LOGIC ===
+
+log("=== Spuštění ===")
+log(f"Zdrojový adresář: {SOURCE_DIR} (existuje: {SOURCE_DIR.exists()})")
+
+# 1. Rename
+prejmenuj(SOURCE_DIR)
+
+# 2. Wait 10 seconds
+log("Čekám 10 vteřin...")
+time.sleep(10)
+
+# 3. Upload the files
+files = [f for f in SOURCE_DIR.iterdir() if f.is_file()]
+log(f"Souborů k odeslání: {len(files)}")
+for f in files:
+    log(f"  Nalezen: {f.name}")
+
+if not files:
+    log("Žádné soubory k odeslání.")
+else:
+    for f in files:
+        try:
+            with f.open("rb") as fh:
+                resp = requests.post(
+                    UPLOAD_URL,
+                    headers={"Authorization": f"Bearer {TOKEN}"},
+                    files={"file": (f.name, fh, "application/octet-stream")},
+                    timeout=120,
+                )
+            resp.raise_for_status()
+            status = resp.json().get('status', '?').upper()
+            log(f"  {status:10} | {f.name}")
+            move_to_trash(f)
+            log(f"  PŘESUNUTO  | {f.name} -> Trash")
+        except Exception as e:
+            log(f"  CHYBA      | {f.name} | {e}")
+
+log("=== Hotovo ===")
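The DCR branch of `prejmenuj` in the script above decides between ECG and eCOA exports purely from the CSV header: a file is ECG if it has every ECG column, otherwise eCOA if it has every eCOA column, otherwise unknown. A minimal sketch of that rule using set containment follows. The function name `classify_dcr` is illustrative, and the column sets are copied from the constants in the script.

```python
from typing import Iterable, Optional

# Required columns, copied from DCR_ECOA_COLUMNS and DCR_ECG_COLUMNS in the script.
DCR_ECOA_COLUMNS = {"Protocol", "Data Correction ID", "Description", "Query History"}
DCR_ECG_COLUMNS = {
    "Protocol", "Data Correction ID", "Site ID", "PI_NAME",
    "Subject Number", "Query History",
}


def classify_dcr(columns: Iterable[str]) -> Optional[str]:
    """Return the report label for a CSV header, or None if it matches neither type."""
    cols = set(columns)
    # ECG is tested first, as in the script, so a header that satisfies both
    # column sets is labeled as ECG.
    if DCR_ECG_COLUMNS <= cols:
        return "Clario ECG DCRs"
    if DCR_ECOA_COLUMNS <= cols:
        return "Clario eCOA DCRs"
    return None


if __name__ == "__main__":
    print(classify_dcr(["Protocol", "Data Correction ID", "Description", "Query History"]))
```

Extra columns do not affect the result, so an export that gains new fields is still recognized. A renamed required column makes the file fall through to the unknown case.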
@@ -0,0 +1,10 @@
+{
+  "pk": 3237,
+  "title": "Subject_Number_Creation",
+  "label": "Janssen 77242113UCO3001 Subject CZ100132003 has been created in IRT at site DD5-CZ10013",
+  "event": "Create",
+  "actual_date": "2026-05-06",
+  "subject": "CZ100132003",
+  "study": "77242113UCO3001",
+  "text": "77242113UCO3001\n\nJanssen Pharmaceuticals\nhttps://janssen.4gclinical.com\n\nSubject  CZ100132003  has been created in IRT.\n\nSite Details\n\nLocation: CZE\n\nSite: DD5-CZ10013\n\nInvestigator: David Stepek\n\nSubject Details\n\nSubject: CZ100132003\n\nIRT Subject Status: Screened\n\nRescreened Subject: No\n\nCohort: Adult subjects (18 years or older)\n\nInformed Consent Date at Subject Creation: 06-May-2026\n\n ADT-IR: No\n\n 3 or More Advanced Therapies: No\n\n Ustekinumab: No\n\n Only Oral 5-ASA Compounds: No\n\nDate of Subject Creation in IRT: 06-May-2026\n\nTransaction Date/Time (site local): 06-May-2026 10:33:13\n\nTransaction Date/Time (system local): 06-May-2026 08:33:13\n\nTransaction performed by: dstepek@vnbrno.cz\n\nIf you have questions about this notification, please contact 4G Clinical Support at http://support.4gclinical.com"
+}
@@ -0,0 +1,10 @@
+{
+  "pk": 3510,
+  "title": "Subject_Number_Creation",
+  "label": "Janssen 77242113UCO3001 Subject CZ100032001 has been created in IRT at site DD5-CZ10003",
+  "event": "Create",
+  "actual_date": "2026-05-13",
+  "subject": "CZ100032001",
+  "study": "77242113UCO3001",
+  "text": "77242113UCO3001\n\nJanssen Pharmaceuticals\nhttps://janssen.4gclinical.com\n\nSubject  CZ100032001  has been created in IRT.\n\nSite Details\n\nLocation: CZE\n\nSite: DD5-CZ10003\n\nInvestigator: Leksa Vaclav\n\nSubject Details\n\nSubject: CZ100032001\n\nIRT Subject Status: Screened\n\nRescreened Subject: No\n\nCohort: Adult subjects (18 years or older)\n\nInformed Consent Date at Subject Creation: 13-May-2026\n\n ADT-IR: No\n\n 3 or More Advanced Therapies: No\n\n Ustekinumab: No\n\n Only Oral 5-ASA Compounds: No\n\nDate of Subject Creation in IRT: 13-May-2026\n\nTransaction Date/Time (site local): 13-May-2026 07:44:11\n\nTransaction Date/Time (system local): 13-May-2026 05:44:11\n\nTransaction performed by: vaclav.leksa@seznam.cz\n\nIf you have questions about this notification, please contact 4G Clinical Support at http://support.4gclinical.com"
+}
@@ -0,0 +1,10 @@
+{
+  "pk": 4231,
+  "title": "Subject_Number_Creation",
+  "label": "Janssen 77242113UCO3001 Subject CZ100162002 has been created in IRT at site DD5-CZ10016",
+  "event": "Create",
+  "actual_date": "2026-05-27",
+  "subject": "CZ100162002",
+  "study": "77242113UCO3001",
+  "text": "77242113UCO3001\n\nJanssen Pharmaceuticals\nhttps://janssen.4gclinical.com\n\nSubject  CZ100162002  has been created in IRT.\n\nSite Details\n\nLocation: CZE\n\nSite: DD5-CZ10016\n\nInvestigator: Robert Mudr\n\nSubject Details\n\nSubject: CZ100162002\n\nIRT Subject Status: Screened\n\nRescreened Subject: No\n\nCohort: Adult subjects (18 years or older)\n\nInformed Consent Date at Subject Creation: 27-May-2026\n\n ADT-IR: Yes\n\n 3 or More Advanced Therapies: No\n\n Ustekinumab: No\n\n Only Oral 5-ASA Compounds: No\n\nDate of Subject Creation in IRT: 27-May-2026\n\nTransaction Date/Time (site local): 27-May-2026 11:55:28\n\nTransaction Date/Time (system local): 27-May-2026 09:55:28\n\nTransaction performed by: petr.pekny@nmskb.cz\n\nIf you have questions about this notification, please contact 4G Clinical Support at http://support.4gclinical.com"
+}
@@ -0,0 +1,10 @@
+{
+  "pk": 4271,
+  "title": "Subject_Number_Creation",
+  "label": "Janssen 77242113UCO3001 Subject CZ100012004 has been created in IRT at site DD5-CZ10001",
+  "event": "Create",
+  "actual_date": "2026-05-28",
+  "subject": "CZ100012004",
+  "study": "77242113UCO3001",
+  "text": "77242113UCO3001\n\nJanssen Pharmaceuticals\nhttps://janssen.4gclinical.com\n\nSubject  CZ100012004  has been created in IRT.\n\nSite Details\n\nLocation: CZE\n\nSite: DD5-CZ10001\n\nInvestigator: Matej Falc\n\nSubject Details\n\nSubject: CZ100012004\n\nIRT Subject Status: Screened\n\nRescreened Subject: No\n\nCohort: Adult subjects (18 years or older)\n\nInformed Consent Date at Subject Creation: 28-May-2026\n\n ADT-IR: No\n\n 3 or More Advanced Therapies: No\n\n Ustekinumab: No\n\n Only Oral 5-ASA Compounds: No\n\nDate of Subject Creation in IRT: 28-May-2026\n\nTransaction Date/Time (site local): 28-May-2026 07:14:21\n\nTransaction Date/Time (system local): 28-May-2026 05:14:21\n\nTransaction performed by: matesfalc@seznam.cz\n\nIf you have questions about this notification, please contact 4G Clinical Support at http://support.4gclinical.com"
+}
@@ -0,0 +1,10 @@
+{
+  "pk": 4461,
+  "title": "Randomized",
+  "label": "Janssen 77242113UCO3001 Subject randomized CZ100132003 at site DD5-CZ10013",
+  "event": "I0",
+  "actual_date": "2026-06-02",
+  "subject": "CZ100132003",
+  "study": "77242113UCO3001",
+  "text": "77242113UCO3001\n\nJanssen Pharmaceuticals\nhttps://janssen.4gclinical.com\n\nSubject CZ100132003 has been randomized.\n\n    The following medication(s) has been assigned to the subject:\n\n    \n    \n        Medication No\n        Medication Type\n        Packaged Lot No\n        Expiration Date\n    \n    \n        \n            1056513\n            Icotrokinra 320mg / placebo\n            4393030\n            19-Jan-2027\n        \n    \n    \n\nSite Details\n\nLocation: CZE\n\nSite: DD5-CZ10013\n\nInvestigator: David Stepek\n\nSubject Details\n\nSubject: CZ100132003\n\nIRT Subject Status: Randomized\n\nCohort: Adult subjects (18 years or older)\n\n ADT-IR: No\n\n 3 or More Advanced Therapies: No\n\n Ustekinumab: No\n\n Only Oral 5-ASA Compounds: No\n \n Isolated Proctitis: No\n\nTransaction Date/Time (site local): 02-Jun-2026 08:19:11\n\nTransaction Date/Time (system local): 02-Jun-2026 06:19:11\n\nTransaction performed by: dstepek@vnbrno.cz\n\nIf you have questions about this notification, please contact 4G Clinical Support at http://support.4gclinical.com"
+}
@@ -0,0 +1,449 @@
+"""
+download_attachments_v1.0.py
+Name:     download_attachments_v1.0.py
+Version:  1.0
+Date:     2026-06-02
+Author:   vladimir.buzalka
+
+Description:
+    Downloads the real attachments (is_inline=False) of all emails in the MongoDB
+    collection ordinace@buzalkova.cz directly through the Microsoft Graph API and
+    saves them to the directory /mnt/Emails/ordinace@buzalkova.cz/Attachments/.
+
+    Deduplication by the SHA256 hash of the content:
+        - same hash = the file already exists -> skip it
+        - first occurrence of a file: saved under its original name
+        - name collision (same name, different hash): faktura_2.pdf, faktura_3.pdf ...
+
+    After saving, MongoDB is updated:
+        - in the email document: each attachment gets file_hash + local_path
+        - collection emaily.attachments_index: _id=hash, filename, local_path, size_bytes,
+          mime_type, first_seen_at, ref_count (number of emails that contain it)
+
+    Safe to interrupt and rerun:
+        - messages whose attachments are all already downloaded (they have file_hash) are skipped
+        - --force-recheck also re-verifies the ones already downloaded (in case of changes on disk)
+
+    NOTE: The script only READS from the mailbox, it never writes to it!
+
+Usage:
+    python download_attachments_v1.0.py               # download everything that is missing
+    python download_attachments_v1.0.py --limit 50    # test on the first 50 emails
+    python download_attachments_v1.0.py --force-recheck  # also re-verify already downloaded
+
+Docker (after adding the mount /mnt/user/Emails -> /mnt/Emails):
+    docker exec -it python-runner python /scripts/download_attachments_v1.0.py
+
+Dependencies:
+    msal, requests, pymongo, python-dateutil
+    Python 3.10+
+
+Layout on disk:
+    /mnt/Emails/
+    └── ordinace@buzalkova.cz/
+        └── Attachments/
+            ├── faktura_2026.pdf
+            ├── vysledky_lab.pdf
+            ├── vysledky_lab_2.pdf   <- name collision, different content
+            └── ...
+
+Collection emaily.attachments_index:
+    _id            SHA256 hash (hex)
+    filename       file name on disk (first occurrence)
+    local_path     path relative to Attachments/ (currently = filename)
+    size_bytes     file size
+    mime_type      MIME type
+    first_seen_at  datetime UTC
+    ref_count      number of emails in which this attachment occurs
+
+Updates in the email document (collection ordinace@buzalkova.cz):
+    attachments[i].file_hash    SHA256 hash
+    attachments[i].local_path   path relative to Attachments/
+
+Version history:
+    1.0  2026-06-02  Initial version
+"""
+
+import sys
+import hashlib
+import logging
+import argparse
+from pathlib import Path
+from datetime import datetime, timezone
+from typing import Optional
+
+import msal
+import requests
+from pymongo import MongoClient, UpdateOne
+
+if hasattr(sys.stdout, "reconfigure"):
+    sys.stdout.reconfigure(encoding="utf-8", errors="replace")
+
+# ─── CONFIGURATION ────────────────────────────────────────────────────────────
+GRAPH_TENANT_ID     = "7d269944-37a4-43a1-8140-c7517dc426e9"
+GRAPH_CLIENT_ID     = "4b222bfd-78c9-4239-a53f-43006b3ed07f"
+GRAPH_CLIENT_SECRET = "Txg8Q~MjhocuopxsJyJBhPmDfMxZ2r5WpTFj1dfk"
+GRAPH_MAILBOX       = "ordinace@buzalkova.cz"
+GRAPH_URL           = "https://graph.microsoft.com/v1.0"
+
+MONGO_URI           = "mongodb://192.168.1.76:27017"
+MONGO_DB            = "emaily"
+MONGO_COL_EMAILS    = "ordinace@buzalkova.cz"
+MONGO_COL_INDEX     = "attachments_index"
+
+ATTACHMENTS_DIR     = Path("/mnt/Emails/ordinace@buzalkova.cz/Attachments")
+LOG_FILE            = Path(__file__).parent / "parse_emails_errors.log"
+SCRIPT_VERSION      = "1.0"
+BATCH_SIZE          = 50
+# ──────────────────────────────────────────────────────────────────────────────
+
+logging.basicConfig(
+    filename=str(LOG_FILE),
+    level=logging.ERROR,
+    format="%(asctime)s | %(message)s",
+    datefmt="%Y-%m-%d %H:%M:%S",
+    encoding="utf-8",
+)
+
+_graph_token: Optional[str] = None
+
+
+# ─── Graph API ────────────────────────────────────────────────────────────────
+
+def get_token() -> str:
+    global _graph_token
+    app = msal.ConfidentialClientApplication(
+        GRAPH_CLIENT_ID,
+        authority=f"https://login.microsoftonline.com/{GRAPH_TENANT_ID}",
+        client_credential=GRAPH_CLIENT_SECRET,
+    )
+    result = app.acquire_token_for_client(scopes=["https://graph.microsoft.com/.default"])
+    if "access_token" not in result:
+        raise RuntimeError(f"Graph auth failed: {result}")
+    _graph_token = result["access_token"]
+    return _graph_token
+
+
+def graph_get_bytes(url: str) -> bytes:
+    """Download the binary content of an attachment."""
+    global _graph_token
+    if not _graph_token:
+        get_token()
+    for attempt in range(2):
+        r = requests.get(url, headers={"Authorization": f"Bearer {_graph_token}"}, timeout=120, stream=True)
+        if r.status_code == 401:
+            get_token()
+            continue
+        r.raise_for_status()
+        return r.content
+    raise RuntimeError(f"Graph GET bytes failed: {url}")
+
+
+def graph_get_json(url: str, params: Optional[dict] = None) -> dict:
+    global _graph_token
+    if not _graph_token:
+        get_token()
+    for attempt in range(2):
+        r = requests.get(url, headers={"Authorization": f"Bearer {_graph_token}"}, params=params, timeout=30)
+        if r.status_code == 401:
+            get_token()
+            continue
+        r.raise_for_status()
+        return r.json()
+    raise RuntimeError(f"Graph GET json failed: {url}")
+
+
+def fetch_attachment_content(graph_message_id: str, attachment_id: str) -> Optional[bytes]:
+    """Download the content of an attachment through the Graph API."""
+    url = f"{GRAPH_URL}/users/{GRAPH_MAILBOX}/messages/{graph_message_id}/attachments/{attachment_id}/$value"
+    try:
+        return graph_get_bytes(url)
+    except Exception as e:
+        logging.error("fetch_attachment_content failed [msg=%s att=%s]: %s", graph_message_id, attachment_id, e)
+        return None
+
+
+def fetch_message_attachments(graph_message_id: str) -> list[dict]:
+    """Load the message's attachment list from the Graph API (metadata including attachment ID)."""
+    url = f"{GRAPH_URL}/users/{GRAPH_MAILBOX}/messages/{graph_message_id}/attachments"
+    try:
+        data = graph_get_json(url, {"$select": "id,name,contentType,size,isInline,contentId"})
+        return data.get("value", [])
+    except Exception as e:
+        logging.error("fetch_message_attachments failed [%s]: %s", graph_message_id, e)
+        return []
+
+
+# ─── Dedup + storage ──────────────────────────────────────────────────────────
+
+def sha256(data: bytes) -> str:
+    return hashlib.sha256(data).hexdigest()
+
+
+def resolve_filename(desired_name: str, att_dir: Path, hash_val: str, index_col) -> str:
+    """
+    Return the file name to use when saving.
+    If desired_name already exists with a different hash, append a suffix _2, _3 ...
+    """
+    # Check whether the existing file with the same name has the same hash
+    existing = index_col.find_one({"filename": desired_name})
+    if existing:
+        if existing["_id"] == hash_val:
+            return desired_name  # Same hash, same name: dedup hit
+        # Different hash: look for a free suffix
+        stem   = Path(desired_name).stem
+        suffix = Path(desired_name).suffix
+        n = 2
+        while True:
+            candidate = f"{stem}_{n}{suffix}"
+            if not (att_dir / candidate).exists():
+                # Make sure the index does not hold this candidate under a different hash either
+                ex2 = index_col.find_one({"filename": candidate})
+                if not ex2 or ex2["_id"] == hash_val:
+                    return candidate
+            n += 1
+    return desired_name
+
+
+def save_attachment(content: bytes, original_name: str, att_dir: Path, index_col) -> tuple[str, str, bool]:
+    """
+    Save an attachment with deduplication.
+    Returns (hash, local_path, was_new):
+        was_new=True  -> the file was saved
+        was_new=False -> the hash already existed, the file was skipped
+    """
+    hash_val = sha256(content)
+
+    # Check the index: if the hash already exists, return the existing record
+    existing = index_col.find_one({"_id": hash_val})
+    if existing:
+        # Increment the reference counter
+        index_col.update_one({"_id": hash_val}, {"$inc": {"ref_count": 1}})
+        return hash_val, existing["local_path"], False
+
+    # New file: determine its name
+    safe_name = "".join(c if c.isalnum() or c in "._- " else "_" for c in original_name).strip()
+    if not safe_name:
+        safe_name = f"attachment_{hash_val[:8]}"
+
+    filename  = resolve_filename(safe_name, att_dir, hash_val, index_col)
+    file_path = att_dir / filename
+
+    # Write the file
+    file_path.write_bytes(content)
+
+    # Record it in the index
+    index_col.insert_one({
+        "_id":          hash_val,
+        "filename":     filename,
+        "local_path":   filename,
+        "size_bytes":   len(content),
+        "mime_type":    "",
+        "first_seen_at": datetime.now(timezone.utc).replace(tzinfo=None),
+        "ref_count":    1,
+    })
+
+    return hash_val, filename, True
+
+
+# ─── MAIN ─────────────────────────────────────────────────────────────────────
+
+def main():
+    ap = argparse.ArgumentParser(description=f"download_attachments v{SCRIPT_VERSION}")
+    ap.add_argument("--limit",         type=int, default=0,
+                    help="Process at most N emails (0 = all)")
+    ap.add_argument("--force-recheck", action="store_true",
+                    help="Re-verify emails whose attachments already have a file_hash")
+    ap.add_argument("--no-indexes",    action="store_true",
+                    help="Do not create indexes on the attachment index collection")
+    args = ap.parse_args()
+
+    start = datetime.now()
+    print(f"=== download_attachments v{SCRIPT_VERSION} ===")
+    print(f"Start:    {start.strftime('%Y-%m-%d %H:%M:%S')}")
+    print(f"Mailbox:  {GRAPH_MAILBOX}")
+    print(f"Target directory: {ATTACHMENTS_DIR}")
+    print(f"MongoDB:  {MONGO_URI} -> {MONGO_DB}")
+
+    # Directory
+    ATTACHMENTS_DIR.mkdir(parents=True, exist_ok=True)
+    print("  Directory OK")
+
+    # Graph
+    print("\nConnecting to the Graph API...")
+    try:
+        get_token()
+        print("  Graph API OK")
+    except Exception as e:
+        print(f"  ERROR: {e}")
+        sys.exit(1)
+
+    # MongoDB
+    client = MongoClient(MONGO_URI, serverSelectionTimeoutMS=5000)
+    try:
+        client.admin.command("ping")
+        print("  MongoDB OK")
+    except Exception as e:
+        print(f"  ERROR: MongoDB is not reachable -- {e}")
+        sys.exit(1)
+
+    col_emails = client[MONGO_DB][MONGO_COL_EMAILS]
+    col_index  = client[MONGO_DB][MONGO_COL_INDEX]
+
+    # Indexes on the attachment index collection
+    if not args.no_indexes:
+        col_index.create_index("filename")
+        col_index.create_index("mime_type")
+
+    # Query: emails with attachments that have not been processed yet
+    if args.force_recheck:
+        query = {"has_attachments": True}
+    else:
+        query = {
+            "has_attachments": True,
+            "attachments": {
+                "$elemMatch": {
+                    "is_inline": False,
+                    "file_hash":  {"$exists": False},
+                }
+            }
+        }
+
+    total = col_emails.count_documents(query)
+    print(f"\nEmails to process: {total}")
+    if total == 0:
+        print("Nothing to download.")
+        client.close()
+        return
+
+    cursor = col_emails.find(query, {"_id": 1, "graph_id": 1, "subject": 1, "attachments": 1})
+    if args.limit:
+        cursor = cursor.limit(args.limit)
+
+    ok_count   = 0
+    new_count  = 0
+    skip_count = 0
+    err_count  = 0
+    email_i    = 0
+    batch      = []
+
+    def flush():
+        if not batch:
+            return
+        try:
+            col_emails.bulk_write(batch, ordered=False)
+        except Exception as e:
+            logging.error("bulk_write: %s", e)
+            print(f"  ERROR bulk_write: {e}")
+        batch.clear()
+
+    for email_doc in cursor:
+        email_i += 1
+        email_id   = email_doc["_id"]
+        graph_id   = email_doc.get("graph_id", "")
+        subject    = (email_doc.get("subject") or "")[:60]
+        att_list   = email_doc.get("attachments") or []
+
+        # Real attachments only
+        real_atts = [a for a in att_list if not a.get("is_inline", False)]
+        if not real_atts:
+            continue
+
+        print(f"\n  {email_i:>5}/{total}  {subject}")
+
+        # Load the attachment IDs from the Graph API
+        graph_atts = fetch_message_attachments(graph_id)
+        graph_att_map = {a["name"]: a for a in graph_atts if not a.get("isInline", False)}
+
+        updated_atts = list(att_list)
+        email_ok = True
+
+        for i, att in enumerate(updated_atts):
+            if att.get("is_inline", False):
+                continue
+            if not args.force_recheck and att.get("file_hash"):
+                skip_count += 1
+                print(f"         SKIP  {att['filename']}")
+                continue
+
+            att_name    = att.get("filename", "")
+            graph_att   = graph_att_map.get(att_name)
+
+            if not graph_att:
+                # Try to find it by a partial name match
+                for gname, ga in graph_att_map.items():
+                    if att_name.lower() in gname.lower():
+                        graph_att = ga
+                        break
+
+            if not graph_att:
+                logging.error("attachment not found in Graph [email=%s att=%s]", email_id, att_name)
+                print(f"         ERR   {att_name} (not found in Graph)")
+                err_count += 1
+                email_ok = False
+                continue
+
+            # Download the content
+            content = fetch_attachment_content(graph_id, graph_att["id"])
+            if content is None:
+                err_count += 1
+                email_ok = False
+                print(f"         ERR   {att_name} (download failed)")
+                continue
+
+            # Save with dedup
+            hash_val, local_path, was_new = save_attachment(content, att_name, ATTACHMENTS_DIR, col_index)
+
+            # Update the MIME type in the index; fall back to Graph when the stored value is empty
+            col_index.update_one(
+                {"_id": hash_val},
+                {"$set": {"mime_type": att.get("mime_type") or graph_att.get("contentType", "")}},
+            )
+
+            # Record on the email
+            updated_atts[i] = {**att, "file_hash": hash_val, "local_path": local_path}
+
+            if was_new:
+                new_count += 1
+                print(f"         NEW   {local_path}  ({len(content):,} B)")
+            else:
+                skip_count += 1
+                print(f"         DUP   {att_name} -> {local_path}")
+
+        if email_ok:
+            ok_count += 1
+
+        # Write the updated attachments back to the email
+        batch.append(UpdateOne(
+            {"_id": email_id},
+            {"$set": {"attachments": updated_atts}}
+        ))
+
+        if len(batch) >= BATCH_SIZE:
+            flush()
+
+        if email_i % 100 == 0:
+            elapsed = (datetime.now() - start).total_seconds()
+            print(f"  {'─'*60}")
+            print(f"  Progress: emails={email_i}/{total}  new={new_count}  dup={skip_count}  err={err_count}")
+            print(f"  {'─'*60}")
+
+    flush()
+
+    elapsed_total = (datetime.now() - start).total_seconds()
+    files_total   = col_index.count_documents({})
+    size_total    = sum(d.get("size_bytes", 0) for d in col_index.find({}, {"size_bytes": 1}))
+
+    print(f"\n{'='*52}")
+    print(f"Result:  emails={ok_count}  |  new files={new_count}  |  duplicates={skip_count}  |  err={err_count}")
+    print(f"Files in index: {files_total}  ({size_total/1024/1024:.1f} MB)")
+    print(f"Total time: {int(elapsed_total//3600)}h {int((elapsed_total%3600)//60)}m {int(elapsed_total%60)}s")
+    print(f"\nEnd: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
+    if err_count:
+        print(f"Errors logged to: {LOG_FILE}")
+
+    client.close()
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,428 @@
+"""
+download_attachments_v1.1.py
+Name:    download_attachments_v1.1.py
+Version: 1.1
+Date:    2026-06-02
+Author:  vladimir.buzalka
+
+Description:
+    Downloads the real attachments (is_inline=False) of all emails in MongoDB
+    through the Microsoft Graph API and stores them in the directory
+    /mnt/Emails/<mailbox>/Attachments/.
+
+    The mailbox is passed as the required --mailbox parameter.
+
+    Deduplication by SHA256 hash of the content:
+        - same hash = file already exists -> skipped
+        - first occurrence of a file: saved under its original name
+        - name collision (same name, different hash): faktura_2.pdf, faktura_3.pdf ...
+
+    After saving, MongoDB is updated:
+        - in the email document: each attachment gets file_hash + local_path
+        - collection emaily.attachments_index: _id=hash, filename, local_path, size_bytes,
+          mime_type, mailbox, first_seen_at, ref_count
+
+    Safe to interrupt and rerun: emails whose attachments all have a file_hash
+    are skipped. --force-recheck re-verifies the ones already downloaded too.
+
+    NOTE: The script only READS from the mailbox; it never writes to it!
+
+Usage:
+    python download_attachments_v1.1.py --mailbox ordinace@buzalkova.cz
+    python download_attachments_v1.1.py --mailbox vladimir.buzalka@buzalka.cz --limit 50
+    python download_attachments_v1.1.py --mailbox ordinace@buzalkova.cz --force-recheck
+
+Docker:
+    docker exec -it python-runner python /scripts/download_attachments_v1.1.py \\
+        --mailbox ordinace@buzalkova.cz
+
+Dependencies:
+    msal, requests, pymongo
+    Python 3.10+
+
+Layout on disk:
+    /mnt/Emails/
+    └── <mailbox>/
+        └── Attachments/
+            ├── faktura_2026.pdf
+            ├── vysledky_lab.pdf
+            ├── vysledky_lab_2.pdf
+            └── ...
+
+Collection emaily.attachments_index:
+    _id            SHA256 hash (hex)
+    filename       file name on disk
+    local_path     path relative to Attachments/
+    size_bytes     file size
+    mime_type      MIME type
+    mailbox        mailbox in which the file first occurred
+    first_seen_at  datetime UTC
+    ref_count      number of emails in which this attachment occurs
+
+Version history:
+    1.0  2026-06-02  Initial version
+    1.1  2026-06-02  Mailbox as the --mailbox parameter (general-purpose use)
+"""
+
+import sys
+import hashlib
+import logging
+import argparse
+from pathlib import Path
+from datetime import datetime, timezone
+from typing import Optional
+
+import msal
+import requests
+from pymongo import MongoClient, UpdateOne
+
+if hasattr(sys.stdout, "reconfigure"):
+    sys.stdout.reconfigure(encoding="utf-8", errors="replace")
+
+# ─── CONFIGURATION ────────────────────────────────────────────────────────────
+GRAPH_TENANT_ID     = "7d269944-37a4-43a1-8140-c7517dc426e9"
+GRAPH_CLIENT_ID     = "4b222bfd-78c9-4239-a53f-43006b3ed07f"
+GRAPH_CLIENT_SECRET = "Txg8Q~MjhocuopxsJyJBhPmDfMxZ2r5WpTFj1dfk"
+GRAPH_URL           = "https://graph.microsoft.com/v1.0"
+
+MONGO_URI           = "mongodb://192.168.1.76:27017"
+MONGO_DB            = "emaily"
+MONGO_COL_INDEX     = "attachments_index"
+
+EMAILS_BASE_DIR     = Path("/mnt/Emails")
+LOG_FILE            = Path(__file__).parent / "parse_emails_errors.log"
+SCRIPT_VERSION      = "1.1"
+BATCH_SIZE          = 50
+# ──────────────────────────────────────────────────────────────────────────────
+
+logging.basicConfig(
+    filename=str(LOG_FILE),
+    level=logging.ERROR,
+    format="%(asctime)s | %(message)s",
+    datefmt="%Y-%m-%d %H:%M:%S",
+    encoding="utf-8",
+)
+
+_graph_token: Optional[str] = None
+
+
+# ─── Graph API ────────────────────────────────────────────────────────────────
+
+def get_token() -> str:
+    global _graph_token
+    app = msal.ConfidentialClientApplication(
+        GRAPH_CLIENT_ID,
+        authority=f"https://login.microsoftonline.com/{GRAPH_TENANT_ID}",
+        client_credential=GRAPH_CLIENT_SECRET,
+    )
+    result = app.acquire_token_for_client(scopes=["https://graph.microsoft.com/.default"])
+    if "access_token" not in result:
+        raise RuntimeError(f"Graph auth failed: {result}")
+    _graph_token = result["access_token"]
+    return _graph_token
+
+
+def graph_get_bytes(url: str) -> bytes:
+    global _graph_token
+    if not _graph_token:
+        get_token()
+    for attempt in range(2):
+        r = requests.get(url, headers={"Authorization": f"Bearer {_graph_token}"}, timeout=120, stream=True)
+        if r.status_code == 401:
+            get_token()
+            continue
+        r.raise_for_status()
+        return r.content
+    raise RuntimeError(f"Graph GET bytes failed: {url}")
+
+
+def graph_get_json(url: str, params: Optional[dict] = None) -> dict:
+    global _graph_token
+    if not _graph_token:
+        get_token()
+    for attempt in range(2):
+        r = requests.get(url, headers={"Authorization": f"Bearer {_graph_token}"}, params=params, timeout=30)
+        if r.status_code == 401:
+            get_token()
+            continue
+        r.raise_for_status()
+        return r.json()
+    raise RuntimeError(f"Graph GET json failed: {url}")
+
+
+def fetch_message_attachments(mailbox: str, graph_message_id: str) -> list[dict]:
+    url = f"{GRAPH_URL}/users/{mailbox}/messages/{graph_message_id}/attachments"
+    try:
+        data = graph_get_json(url, {"$select": "id,name,contentType,size,isInline,contentId"})
+        return data.get("value", [])
+    except Exception as e:
+        logging.error("fetch_message_attachments failed [%s]: %s", graph_message_id, e)
+        return []
+
+
+def fetch_attachment_content(mailbox: str, graph_message_id: str, attachment_id: str) -> Optional[bytes]:
+    url = f"{GRAPH_URL}/users/{mailbox}/messages/{graph_message_id}/attachments/{attachment_id}/$value"
+    try:
+        return graph_get_bytes(url)
+    except Exception as e:
+        logging.error("fetch_attachment_content failed [msg=%s att=%s]: %s", graph_message_id, attachment_id, e)
+        return None
+
+
+# ─── Dedup + storage ──────────────────────────────────────────────────────────
+
+def sha256(data: bytes) -> str:
+    return hashlib.sha256(data).hexdigest()
+
+
+def safe_filename(name: str) -> str:
+    safe = "".join(c if c.isalnum() or c in "._- " else "_" for c in name).strip()
+    return safe or "attachment"
+
+
+def resolve_filename(desired_name: str, att_dir: Path, hash_val: str, col_index) -> str:
+    """Return the file name to save under; resolves collisions (same name, different hash)."""
+    existing = col_index.find_one({"filename": desired_name})
+    if existing:
+        if existing["_id"] == hash_val:
+            return desired_name  # Dedup hit: same hash
+        # Collision: look for a free suffix
+        stem   = Path(desired_name).stem
+        suffix = Path(desired_name).suffix
+        n = 2
+        while True:
+            candidate = f"{stem}_{n}{suffix}"
+            ex2 = col_index.find_one({"filename": candidate})
+            if not ex2 or ex2["_id"] == hash_val:
+                if not (att_dir / candidate).exists() or (ex2 and ex2["_id"] == hash_val):
+                    return candidate
+            n += 1
+    return desired_name
+
+
+def save_attachment(
+    content: bytes,
+    original_name: str,
+    mime_type: str,
+    mailbox: str,
+    att_dir: Path,
+    col_index,
+) -> tuple[str, str, bool]:
+    """
+    Save an attachment with deduplication.
+    Returns (hash, local_path, was_new).
+    """
+    hash_val = sha256(content)
+
+    existing = col_index.find_one({"_id": hash_val})
+    if existing:
+        col_index.update_one({"_id": hash_val}, {"$inc": {"ref_count": 1}})
+        return hash_val, existing["local_path"], False
+
+    filename  = resolve_filename(safe_filename(original_name), att_dir, hash_val, col_index)
+    file_path = att_dir / filename
+    file_path.write_bytes(content)
+
+    col_index.insert_one({
+        "_id":          hash_val,
+        "filename":     filename,
+        "local_path":   filename,
+        "size_bytes":   len(content),
+        "mime_type":    mime_type,
+        "mailbox":      mailbox,
+        "first_seen_at": datetime.now(timezone.utc).replace(tzinfo=None),
+        "ref_count":    1,
+    })
+
+    return hash_val, filename, True
+
+
+# ─── MAIN ─────────────────────────────────────────────────────────────────────
+
+def main():
+    ap = argparse.ArgumentParser(description=f"download_attachments v{SCRIPT_VERSION}")
+    ap.add_argument("--mailbox",       required=True,
+                    help="Email mailbox (e.g. ordinace@buzalkova.cz)")
+    ap.add_argument("--limit",         type=int, default=0,
+                    help="Process at most N emails (0 = all)")
+    ap.add_argument("--force-recheck", action="store_true",
+                    help="Re-verify emails whose attachments already have a file_hash")
+    ap.add_argument("--no-indexes",    action="store_true",
+                    help="Do not create indexes on the attachments_index collection")
+    args = ap.parse_args()
+
+    mailbox     = args.mailbox
+    att_dir     = EMAILS_BASE_DIR / mailbox / "Attachments"
+    mongo_col   = mailbox
+
+    start = datetime.now()
+    print(f"=== download_attachments v{SCRIPT_VERSION} ===")
+    print(f"Start:    {start.strftime('%Y-%m-%d %H:%M:%S')}")
+    print(f"Mailbox:  {mailbox}")
+    print(f"Target directory: {att_dir}")
+    print(f"MongoDB:  {MONGO_URI} -> {MONGO_DB}.{mongo_col}")
+
+    att_dir.mkdir(parents=True, exist_ok=True)
+    print("  Directory OK")
+
+    print("\nConnecting to the Graph API...")
+    try:
+        get_token()
+        print("  Graph API OK")
+    except Exception as e:
+        print(f"  ERROR: {e}")
+        sys.exit(1)
+
+    client = MongoClient(MONGO_URI, serverSelectionTimeoutMS=5000)
+    try:
+        client.admin.command("ping")
+        print("  MongoDB OK")
+    except Exception as e:
+        print(f"  ERROR: MongoDB is not reachable -- {e}")
+        sys.exit(1)
+
+    col_emails = client[MONGO_DB][mongo_col]
+    col_index  = client[MONGO_DB][MONGO_COL_INDEX]
+
+    if not args.no_indexes:
+        col_index.create_index("filename")
+        col_index.create_index("mime_type")
+        col_index.create_index("mailbox")
+
+    # Query
+    if args.force_recheck:
+        query = {"has_attachments": True}
+    else:
+        query = {
+            "has_attachments": True,
+            "attachments": {
+                "$elemMatch": {
+                    "is_inline": False,
+                    "file_hash": {"$exists": False},
+                }
+            }
+        }
+
+    total = col_emails.count_documents(query)
+    print(f"\nEmails to process: {total}")
+    if total == 0:
+        print("Nothing to download.")
+        client.close()
+        return
+
+    cursor = col_emails.find(query, {"_id": 1, "graph_id": 1, "subject": 1, "attachments": 1})
+    if args.limit:
+        cursor = cursor.limit(args.limit)
+
+    ok_count   = 0
+    new_count  = 0
+    dup_count  = 0
+    err_count  = 0
+    email_i    = 0
+    batch      = []
+
+    def flush():
+        if not batch:
+            return
+        try:
+            col_emails.bulk_write(batch, ordered=False)
+        except Exception as e:
+            logging.error("bulk_write: %s", e)
+            print(f"  ERROR bulk_write: {e}")
+        batch.clear()
+
+    for email_doc in cursor:
+        email_i   += 1
+        email_id   = email_doc["_id"]
+        graph_id   = email_doc.get("graph_id", "")
+        subject    = (email_doc.get("subject") or "")[:60]
+        att_list   = email_doc.get("attachments") or []
+
+        real_atts = [a for a in att_list if not a.get("is_inline", False)]
+        if not real_atts:
+            continue
+
+        print(f"\n  {email_i:>5}/{total}  {subject}")
+
+        graph_atts    = fetch_message_attachments(mailbox, graph_id)
+        graph_att_map = {a["name"]: a for a in graph_atts if not a.get("isInline", False)}
+
+        updated_atts = list(att_list)
+        email_ok     = True
+
+        for i, att in enumerate(updated_atts):
+            if att.get("is_inline", False):
+                continue
+            if not args.force_recheck and att.get("file_hash"):
+                print(f"         SKIP  {att['filename']}")
+                continue
+
+            att_name  = att.get("filename", "")
+            graph_att = graph_att_map.get(att_name)
+            if not graph_att:
+                for gname, ga in graph_att_map.items():
+                    if att_name.lower() in gname.lower():
+                        graph_att = ga
+                        break
+
+            if not graph_att:
+                logging.error("attachment not found in Graph [email=%s att=%s]", email_id, att_name)
+                print(f"         ERR   {att_name} (not found in Graph)")
+                err_count += 1
+                email_ok = False
+                continue
+
+            content = fetch_attachment_content(mailbox, graph_id, graph_att["id"])
+            if content is None:
+                err_count += 1
+                email_ok = False
+                print(f"         ERR   {att_name} (download failed)")
+                continue
+
+            mime_type = att.get("mime_type") or graph_att.get("contentType", "")
+            hash_val, local_path, was_new = save_attachment(
+                content, att_name, mime_type, mailbox, att_dir, col_index
+            )
+
+            updated_atts[i] = {**att, "file_hash": hash_val, "local_path": local_path}
+
+            if was_new:
+                new_count += 1
+                print(f"         NEW   {local_path}  ({len(content):,} B)")
+            else:
+                dup_count += 1
+                print(f"         DUP   {att_name} -> {local_path}")
+
+        if email_ok:
+            ok_count += 1
+
+        batch.append(UpdateOne({"_id": email_id}, {"$set": {"attachments": updated_atts}}))
+
+        if len(batch) >= BATCH_SIZE:
+            flush()
+
+        if email_i % 100 == 0:
+            elapsed = (datetime.now() - start).total_seconds()
+            print(f"  {'─'*60}")
+            print(f"  Progress: emails={email_i}/{total}  new={new_count}  dup={dup_count}  err={err_count}")
+            print(f"  {'─'*60}")
+
+    flush()
+
+    elapsed_total = (datetime.now() - start).total_seconds()
+    files_total   = col_index.count_documents({})
+    size_total    = sum(d.get("size_bytes", 0) for d in col_index.find({}, {"size_bytes": 1}))
+
+    print(f"\n{'='*52}")
+    print(f"Result:  emails={ok_count}  |  new={new_count}  |  dup={dup_count}  |  err={err_count}")
+    print(f"Files in index: {files_total}  ({size_total / 1024 / 1024:.1f} MB)")
+    print(f"Total time: {int(elapsed_total//3600)}h {int((elapsed_total%3600)//60)}m {int(elapsed_total%60)}s")
+    print(f"\nEnd: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
+    if err_count:
+        print(f"Errors logged to: {LOG_FILE}")
+
+    client.close()
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,466 @@
+"""
+download_attachments_v1.2.py
+Name:    download_attachments_v1.2.py
+Version: 1.2
+Date:    2026-06-02
+Author:  vladimir.buzalka
+
+Description:
+    Downloads the real attachments (is_inline=False) of all emails in MongoDB
+    through the Microsoft Graph API and stores them in the directory
+    /mnt/Emails/<mailbox>/Attachments/.
+
+    The mailbox is passed as the required --mailbox parameter.
+
+    Deduplication by SHA256 hash of the content:
+        - same hash = file already exists -> skipped
+        - first occurrence of a file: saved under its original name
+        - name collision (same name, different hash): faktura_2.pdf, faktura_3.pdf ...
+
+    After saving, MongoDB is updated:
+        - in the email document: each attachment gets file_hash + local_path
+        - collection emaily.attachments_index: _id=hash, filename, local_path, size_bytes,
+          mime_type, mailbox, first_seen_at, ref_count
+
+    Safe to interrupt and rerun: emails whose attachments all have a file_hash
+    are skipped. --force-recheck re-verifies the ones already downloaded too.
+
+    NOTE: The script only READS from the mailbox; it never writes to it!
+
+Usage:
+    python download_attachments_v1.2.py --mailbox ordinace@buzalkova.cz
+    python download_attachments_v1.2.py --mailbox ordinace@buzalkova.cz --limit 50
+    python download_attachments_v1.2.py --mailbox ordinace@buzalkova.cz --force-recheck
+
+Docker:
+    docker exec -it python-runner python /scripts/download_attachments_v1.2.py \\
+        --mailbox ordinace@buzalkova.cz
+
+Dependencies:
+    msal, requests, pymongo
+    Python 3.10+
+
+Version history:
+    1.0  2026-06-02  Initial version
+    1.1  2026-06-02  Mailbox as the --mailbox parameter
+    1.2  2026-06-02  Fix: Graph attachment map now includes inline items (fixes ERR
+                     for inline images stored as is_inline=False in MongoDB);
+                     name normalization for robust comparison; S/MIME skipped
+                     (.p7m/.p7s); if Graph marks an item inline -> SKIP, not ERR
+"""
+
+import sys
+import re
+import hashlib
+import logging
+import argparse
+import unicodedata
+from pathlib import Path
+from datetime import datetime, timezone
+from typing import Optional
+
+import msal
+import requests
+from pymongo import MongoClient, UpdateOne
+
+if hasattr(sys.stdout, "reconfigure"):
+    sys.stdout.reconfigure(encoding="utf-8", errors="replace")
+
+# ─── CONFIGURATION ────────────────────────────────────────────────────────────
+GRAPH_TENANT_ID     = "7d269944-37a4-43a1-8140-c7517dc426e9"
+GRAPH_CLIENT_ID     = "4b222bfd-78c9-4239-a53f-43006b3ed07f"
+GRAPH_CLIENT_SECRET = "Txg8Q~MjhocuopxsJyJBhPmDfMxZ2r5WpTFj1dfk"
+GRAPH_URL           = "https://graph.microsoft.com/v1.0"
+
+MONGO_URI           = "mongodb://192.168.1.76:27017"
+MONGO_DB            = "emaily"
+MONGO_COL_INDEX     = "attachments_index"
+
+EMAILS_BASE_DIR     = Path("/mnt/Emails")
+LOG_FILE            = Path(__file__).parent / "parse_emails_errors.log"
+SCRIPT_VERSION      = "1.2"
+BATCH_SIZE          = 50
+
+# Attachment types to skip (S/MIME signatures, certificates)
+SKIP_EXTENSIONS = {".p7m", ".p7s", ".p7c", ".p7b"}
+# ──────────────────────────────────────────────────────────────────────────────
+
+logging.basicConfig(
+    filename=str(LOG_FILE),
+    level=logging.ERROR,
+    format="%(asctime)s | %(message)s",
+    datefmt="%Y-%m-%d %H:%M:%S",
+    encoding="utf-8",
+)
+
+_graph_token: Optional[str] = None
+
+
+# ─── Graph API ────────────────────────────────────────────────────────────────
+
+def get_token() -> str:
+    global _graph_token
+    app = msal.ConfidentialClientApplication(
+        GRAPH_CLIENT_ID,
+        authority=f"https://login.microsoftonline.com/{GRAPH_TENANT_ID}",
+        client_credential=GRAPH_CLIENT_SECRET,
+    )
+    result = app.acquire_token_for_client(scopes=["https://graph.microsoft.com/.default"])
+    if "access_token" not in result:
+        raise RuntimeError(f"Graph auth failed: {result}")
+    _graph_token = result["access_token"]
+    return _graph_token
+
+
+def graph_get_bytes(url: str) -> bytes:
+    global _graph_token
+    if not _graph_token:
+        get_token()
+    for attempt in range(2):
+        r = requests.get(url, headers={"Authorization": f"Bearer {_graph_token}"}, timeout=120, stream=True)
+        if r.status_code == 401:
+            get_token()
+            continue
+        r.raise_for_status()
+        return r.content
+    raise RuntimeError(f"Graph GET bytes failed: {url}")
+
+
+def graph_get_json(url: str, params: Optional[dict] = None) -> dict:
+    global _graph_token
+    if not _graph_token:
+        get_token()
+    for attempt in range(2):
+        r = requests.get(url, headers={"Authorization": f"Bearer {_graph_token}"}, params=params, timeout=30)
+        if r.status_code == 401:
+            get_token()
+            continue
+        r.raise_for_status()
+        return r.json()
+    raise RuntimeError(f"Graph GET json failed: {url}")
+
+
+def fetch_message_attachments(mailbox: str, graph_message_id: str) -> list[dict]:
+    """Load ALL attachments of the message (including inline); filtering happens later."""
+    url = f"{GRAPH_URL}/users/{mailbox}/messages/{graph_message_id}/attachments"
+    try:
+        data = graph_get_json(url, {"$select": "id,name,contentType,size,isInline,contentId"})
+        return data.get("value", [])
+    except Exception as e:
+        logging.error("fetch_message_attachments failed [%s]: %s", graph_message_id, e)
+        return []
+
+
+def fetch_attachment_content(mailbox: str, graph_message_id: str, attachment_id: str) -> Optional[bytes]:
+    url = f"{GRAPH_URL}/users/{mailbox}/messages/{graph_message_id}/attachments/{attachment_id}/$value"
+    try:
+        return graph_get_bytes(url)
+    except Exception as e:
+        logging.error("fetch_attachment_content failed [msg=%s att=%s]: %s",
+                      graph_message_id, attachment_id, e)
+        return None
+
+
+# ─── Helper functions ─────────────────────────────────────────────────────────
+
+def normalize_name(name: str) -> str:
+    """Normalize a name for comparison: lowercase, no diacritics, only alnum plus ._-"""
+    nfkd = unicodedata.normalize("NFKD", name.lower().strip())
+    ascii_str = "".join(c for c in nfkd if not unicodedata.combining(c))
+    return re.sub(r"[^\w.\-]", "_", ascii_str)
+
+
+def find_graph_att(att_name: str, att_size: int, graph_atts: list[dict]) -> Optional[dict]:
+    """Look up an attachment in the Graph list, trying in order:
+    1. exact name match
+    2. normalized name match
+    3. normalized name match with size within 10 % (never reached: step 2 returns first)
+    4. match on the last 20 characters of the normalized name
+    """
+    # 1. Exact match
+    for ga in graph_atts:
+        if ga["name"] == att_name:
+            return ga
+
+    norm_want = normalize_name(att_name)
+
+    # 2. Normalizovana shoda
+    for ga in graph_atts:
+        if normalize_name(ga["name"]) == norm_want:
+            return ga
+
+    # 3. Normalizovana shoda + velikost (±10 %)
+    for ga in graph_atts:
+        if normalize_name(ga["name"]) == norm_want:
+            ga_size = ga.get("size", 0)
+            if att_size == 0 or ga_size == 0 or abs(ga_size - att_size) / max(ga_size, att_size) < 0.1:
+                return ga
+
+    # 4. Castecna shoda sufixu (posledních 20 znaků normalizovaného jména)
+    for ga in graph_atts:
+        if norm_want[-20:] and normalize_name(ga["name"]).endswith(norm_want[-20:]):
+            return ga
+
+    return None
+
+
+def sha256(data: bytes) -> str:
+    return hashlib.sha256(data).hexdigest()
+
+
+def safe_filename(name: str) -> str:
+    safe = "".join(c if c.isalnum() or c in "._- ()" else "_" for c in name).strip()
+    return safe or "attachment"
+
+
+def resolve_filename(desired_name: str, att_dir: Path, hash_val: str, col_index) -> str:
+    """Return a file name that does not collide with a different file in the index or on disk."""
+    stem   = Path(desired_name).stem
+    suffix = Path(desired_name).suffix
+    candidate = desired_name
+    n = 2
+    while True:
+        existing = col_index.find_one({"filename": candidate})
+        if existing:
+            # The name is taken in the index; reuse it only for identical content.
+            if existing["_id"] == hash_val:
+                return candidate
+        elif not (att_dir / candidate).exists():
+            # Not indexed and not on disk, so nothing can be overwritten.
+            return candidate
+        candidate = f"{stem}_{n}{suffix}"
+        n += 1
+
+
+def save_attachment(
+    content: bytes,
+    original_name: str,
+    mime_type: str,
+    mailbox: str,
+    att_dir: Path,
+    col_index,
+) -> tuple[str, str, bool]:
+    hash_val = sha256(content)
+
+    existing = col_index.find_one({"_id": hash_val})
+    if existing:
+        col_index.update_one({"_id": hash_val}, {"$inc": {"ref_count": 1}})
+        return hash_val, existing["local_path"], False
+
+    filename  = resolve_filename(safe_filename(original_name), att_dir, hash_val, col_index)
+    file_path = att_dir / filename
+    file_path.write_bytes(content)
+
+    col_index.insert_one({
+        "_id":           hash_val,
+        "filename":      filename,
+        "local_path":    filename,
+        "size_bytes":    len(content),
+        "mime_type":     mime_type,
+        "mailbox":       mailbox,
+        "first_seen_at": datetime.now(timezone.utc).replace(tzinfo=None),
+        "ref_count":     1,
+    })
+
+    return hash_val, filename, True
+
+
+# ─── MAIN ─────────────────────────────────────────────────────────────────────
+
+def main():
+    ap = argparse.ArgumentParser(description=f"download_attachments v{SCRIPT_VERSION}")
+    ap.add_argument("--mailbox",       required=True,
+                    help="Mailbox address (e.g. ordinace@buzalkova.cz)")
+    ap.add_argument("--limit",         type=int, default=0,
+                    help="Process at most N emails (0 = all)")
+    ap.add_argument("--force-recheck", action="store_true",
+                    help="Re-verify emails whose attachments already have a file_hash")
+    ap.add_argument("--no-indexes",    action="store_true",
+                    help="Do not create indexes on the attachments_index collection")
+    args = ap.parse_args()
+
+    mailbox   = args.mailbox
+    att_dir   = EMAILS_BASE_DIR / mailbox / "Attachments"
+    mongo_col = mailbox
+
+    start = datetime.now()
+    print(f"=== download_attachments v{SCRIPT_VERSION} ===")
+    print(f"Start:    {start.strftime('%Y-%m-%d %H:%M:%S')}")
+    print(f"Mailbox:  {mailbox}")
+    print(f"Target directory: {att_dir}")
+    print(f"MongoDB:  {MONGO_URI} -> {MONGO_DB}.{mongo_col}")
+
+    att_dir.mkdir(parents=True, exist_ok=True)
+    print("  Directory OK")
+
+    print("\nConnecting to Graph API...")
+    try:
+        get_token()
+        print("  Graph API OK")
+    except Exception as e:
+        print(f"  ERROR: {e}")
+        sys.exit(1)
+
+    client = MongoClient(MONGO_URI, serverSelectionTimeoutMS=5000)
+    try:
+        client.admin.command("ping")
+        print("  MongoDB OK")
+    except Exception as e:
+        print(f"  ERROR: MongoDB is not available -- {e}")
+        sys.exit(1)
+
+    col_emails = client[MONGO_DB][mongo_col]
+    col_index  = client[MONGO_DB][MONGO_COL_INDEX]
+
+    if not args.no_indexes:
+        col_index.create_index("filename")
+        col_index.create_index("mime_type")
+        col_index.create_index("mailbox")
+
+    if args.force_recheck:
+        query = {"has_attachments": True}
+    else:
+        query = {
+            "has_attachments": True,
+            "attachments": {
+                "$elemMatch": {
+                    "is_inline": False,
+                    "file_hash": {"$exists": False},
+                }
+            }
+        }
+
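+    total = col_emails.count_documents(query)
+    if args.limit:
+        # Keep the progress counter consistent with the --limit applied to the cursor.
+        total = min(total, args.limit)
+    print(f"\nEmails to process: {total}")
+    if total == 0:
+        print("Nothing to download.")
+        client.close()
+        return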
+
+    cursor = col_emails.find(query, {"_id": 1, "graph_id": 1, "subject": 1, "attachments": 1})
+    if args.limit:
+        cursor = cursor.limit(args.limit)
+
+    ok_count  = 0
+    new_count = 0
+    dup_count = 0
+    skip_count = 0
+    err_count = 0
+    email_i   = 0
+    batch     = []
+
+    def flush():
+        if not batch:
+            return
+        try:
+            col_emails.bulk_write(batch, ordered=False)
+        except Exception as e:
+            logging.error("bulk_write: %s", e)
+            print(f"  ERROR bulk_write: {e}")
+        batch.clear()
+
+    for email_doc in cursor:
+        email_i  += 1
+        email_id  = email_doc["_id"]
+        graph_id  = email_doc.get("graph_id", "")
+        subject   = (email_doc.get("subject") or "")[:60]
+        att_list  = email_doc.get("attachments") or []
+
+        real_atts = [a for a in att_list if not a.get("is_inline", False)]
+        if not real_atts:
+            continue
+
+        print(f"\n  {email_i:>5}/{total}  {subject}")
+
+        # Fetch ALL attachments from Graph (including inline ones; they are needed for matching)
+        graph_atts = fetch_message_attachments(mailbox, graph_id)
+
+        updated_atts = list(att_list)
+        email_ok     = True
+
+        for i, att in enumerate(updated_atts):
+            if att.get("is_inline", False):
+                continue
+            if not args.force_recheck and att.get("file_hash"):
+                continue
+
+            att_name = att.get("filename", "")
+            att_size = att.get("size_bytes", 0)
+
+            # Skip S/MIME signatures
+            if Path(att_name).suffix.lower() in SKIP_EXTENSIONS:
+                updated_atts[i] = {**att, "file_hash": "skip", "local_path": ""}
+                skip_count += 1
+                print(f"         SKIP  {att_name} (S/MIME)")
+                continue
+
+            # Find the attachment in the Graph list
+            graph_att = find_graph_att(att_name, att_size, graph_atts)
+
+            if not graph_att:
+                logging.error("attachment not found [email=%s att=%s]", email_id, att_name)
+                print(f"         ERR   {att_name} (not found)")
+                err_count += 1
+                email_ok = False
+                continue
+
+            # If Graph reports the attachment as inline, skip it without downloading
+            if graph_att.get("isInline", False):
+                updated_atts[i] = {**att, "is_inline": True, "file_hash": "skip", "local_path": ""}
+                skip_count += 1
+                print(f"         SKIP  {att_name} (inline image)")
+                continue
+
+            content = fetch_attachment_content(mailbox, graph_id, graph_att["id"])
+            if content is None:
+                err_count += 1
+                email_ok = False
+                print(f"         ERR   {att_name} (download failed)")
+                continue
+
+            mime_type = att.get("mime_type") or graph_att.get("contentType", "")
+            hash_val, local_path, was_new = save_attachment(
+                content, att_name, mime_type, mailbox, att_dir, col_index
+            )
+
+            updated_atts[i] = {**att, "file_hash": hash_val, "local_path": local_path}
+
+            if was_new:
+                new_count += 1
+                print(f"         NEW   {local_path}  ({len(content):,} B)")
+            else:
+                dup_count += 1
+                print(f"         DUP   {att_name} -> {local_path}")
+
+        if email_ok:
+            ok_count += 1
+
+        batch.append(UpdateOne({"_id": email_id}, {"$set": {"attachments": updated_atts}}))
+
+        if len(batch) >= BATCH_SIZE:
+            flush()
+
+        if email_i % 100 == 0:
+            elapsed = (datetime.now() - start).total_seconds()
+            print(f"  {'─'*60}")
+            print(f"  Progress: emails={email_i}/{total}  new={new_count}  dup={dup_count}  skip={skip_count}  err={err_count}  elapsed={elapsed:.0f}s")
+            print(f"  {'─'*60}")
+
+    flush()
+
+    elapsed_total = (datetime.now() - start).total_seconds()
+    files_total   = col_index.count_documents({})
+    size_total    = sum(d.get("size_bytes", 0) for d in col_index.find({}, {"size_bytes": 1}))
+
+    print(f"\n{'='*52}")
+    print(f"Result:  emails={ok_count}  |  new={new_count}  |  dup={dup_count}  |  skip={skip_count}  |  err={err_count}")
+    print(f"Files in index: {files_total}  ({size_total / 1024 / 1024:.1f} MB)")
+    print(f"Total time: {int(elapsed_total//3600)}h {int((elapsed_total%3600)//60)}m {int(elapsed_total%60)}s")
+    print(f"\nEnd: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
+    if err_count:
+        print(f"Errors logged to: {LOG_FILE}")
+
+    client.close()
+
+
+if __name__ == "__main__":
+    main()
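
Review note on `save_attachment` above: the function stores each distinct payload once, keyed by its SHA-256 digest, and only bumps `ref_count` when the same bytes arrive again. The sketch below shows that idea in isolation. It is a simplified illustration, not the script's code: the name `store_blob` is hypothetical, and a plain `dict` stands in for the MongoDB `attachments_index` collection.

```python
import hashlib
import tempfile
from pathlib import Path


def store_blob(content: bytes, name: str, directory: Path, index: dict) -> tuple[str, str, bool]:
    """Store content once per SHA-256 digest; index maps digest -> record.

    Returns (digest, filename, was_new). The dict-based index is an
    assumption of this sketch; the script keeps the same data in MongoDB.
    """
    digest = hashlib.sha256(content).hexdigest()
    record = index.get(digest)
    if record is not None:
        # Same bytes seen before: count the reference, write nothing.
        record["ref_count"] += 1
        return digest, record["filename"], False

    # Different bytes: pick a name not used by the index or on disk.
    taken = {r["filename"] for r in index.values()}
    stem, suffix = Path(name).stem, Path(name).suffix
    candidate, n = name, 2
    while candidate in taken or (directory / candidate).exists():
        candidate = f"{stem}_{n}{suffix}"
        n += 1

    (directory / candidate).write_bytes(content)
    index[digest] = {"filename": candidate, "ref_count": 1}
    return digest, candidate, True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        idx: dict = {}
        print(store_blob(b"report", "report.pdf", Path(tmp), idx))
        print(store_blob(b"report", "copy.pdf", Path(tmp), idx))     # duplicate content
        print(store_blob(b"other", "report.pdf", Path(tmp), idx))    # same name, new content
```

Keying on the digest rather than the file name means a renamed copy of the same file costs no extra disk space, while two different files that happen to share a name both survive under distinct names.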
@@ -0,0 +1,560 @@
+"""
+parse_emails_graph_v1.0.py
+Name:    parse_emails_graph_v1.0.py
+Version: 1.0
+Date:    2026-06-02
+Author:  vladimir.buzalka
+
+Description:
+    Reads all emails from the ordinace@buzalkova.cz mailbox directly via the
+    Microsoft Graph API and imports them as documents into MongoDB.
+    Extracts every available property from each message:
+
+        - subject, sender, recipients (To/CC/BCC with types)
+        - received, sent, created and modified times (UTC)
+        - HTML body (max 2 MB) + text preview
+        - attachments (metadata: name, size, MIME type, inline flag)
+        - internet headers (SPF, DKIM, Received, X-*, ...)
+        - MAPI equivalents: importance, flag, conversation thread,
+          categories, In-Reply-To, References, ...
+        - additionally: isRead, isDraft, folder_path, inferenceClassification
+
+    Walks ALL mailbox folders recursively (Inbox, Sent, Deleted,
+    archive folders, ...).
+
+    DB:          emaily
+    Collection:  ordinace@buzalkova.cz
+    _id:         Internet Message-ID (or "graphid:<id>" as a fallback)
+
+    Safe to interrupt and re-run:
+        - upsert by _id: duplicates are overwritten automatically
+        - --skip-existing loads the finished _id values from MongoDB and skips them
+
+    NOTE: The script only READS from the mailbox; it never writes to it!
+
+Usage:
+    python parse_emails_graph_v1.0.py                    # full import
+    python parse_emails_graph_v1.0.py --limit 50         # test on the first 50
+    python parse_emails_graph_v1.0.py --skip-existing    # resume after an interruption
+    python parse_emails_graph_v1.0.py --folder Inbox     # a single folder only
+    python parse_emails_graph_v1.0.py --no-indexes       # skip index creation at the end
+
+Dependencies:
+    msal, requests, pymongo, python-dateutil
+    Python 3.10+
+
+MongoDB document structure:
+    _id                     Internet Message-ID (or graphid: fallback)
+    graph_id                Graph API message ID (for any follow-up operations)
+    subject                 message subject
+    normalized_subject      subject without RE:/FW:/AW: prefixes
+    importance              0=low 1=normal 2=high
+    flag_status             0=not flagged 1=flagged 2=complete
+    is_read                 bool, current read state in the mailbox
+    is_draft                bool
+    has_attachments         bool
+    attachment_count        int
+    inference_classification focused / other (Outlook AI sorting)
+    categories              [str]
+    conversation_id         Graph conversationId
+    conversation_index      base64 conversationIndex
+    conversation_topic      thread topic (from the Thread-Topic internet header)
+    in_reply_to             Message-ID of the previous message
+    internet_references     [Message-ID], the full thread history
+    received_at             datetime UTC
+    sent_at                 datetime UTC
+    created_at              datetime UTC, when the record was created in M365
+    modified_at             datetime UTC, time of the last modification
+    folder_id               Graph parentFolderId
+    folder_path             full folder path (e.g. Inbox/Subfolder)
+    sender.email            sender's email address
+    sender.name             sender's display name
+    to                      To string (joined)
+    cc                      CC string
+    bcc                     BCC string
+    recipients              [{type, email, name}], to/cc/bcc with types
+    body_html               HTML body (max 2 MB)
+    body_preview            text preview (max 255 characters from Graph)
+    attachments             [{filename, size_bytes, mime_type,
+                              content_id, is_inline}]
+    headers                 dict of internet headers (lowercase_with_underscores)
+    parsed_at               datetime UTC, time of parsing
+
+Indexes:
+    received_at, sent_at, sender.email, graph_id (unique),
+    conversation_id, folder_path, has_attachments, categories,
+    importance, flag_status, is_read,
+    text_search (subject + body_preview + to + cc)
+
+Version history:
+    1.0  2026-06-02  Initial version, with the Graph API as the source
+"""
+
+import os
+import sys
+import re
+import logging
+import argparse
+import base64
+from pathlib import Path
+from datetime import datetime, timezone
+from typing import Optional
+
+import msal
+import requests
+from dateutil import parser as dtparser
+from pymongo import MongoClient, UpdateOne, ASCENDING, TEXT
+
+if hasattr(sys.stdout, "reconfigure"):
+    sys.stdout.reconfigure(encoding="utf-8", errors="replace")
+
+# ─── CONFIGURATION ────────────────────────────────────────────────────────────
+GRAPH_TENANT_ID     = "7d269944-37a4-43a1-8140-c7517dc426e9"
+GRAPH_CLIENT_ID     = "4b222bfd-78c9-4239-a53f-43006b3ed07f"
+# The client secret must not be committed to the repository. It is read from
+# the GRAPH_CLIENT_SECRET environment variable instead (the variable name is a
+# suggestion of this review); get_token() fails with an auth error if it is unset.
+GRAPH_CLIENT_SECRET = os.environ.get("GRAPH_CLIENT_SECRET", "")
+GRAPH_MAILBOX       = "ordinace@buzalkova.cz"
+GRAPH_URL           = "https://graph.microsoft.com/v1.0"
+
+MONGO_URI      = "mongodb://192.168.1.76:27017"
+MONGO_DB       = "emaily"
+MONGO_COL      = "ordinace@buzalkova.cz"
+BATCH_SIZE     = 100
+PAGE_SIZE      = 50
+LOG_FILE       = Path(__file__).parent / "parse_emails_errors.log"
+SCRIPT_VERSION = "1.0"
+# ──────────────────────────────────────────────────────────────────────────────
+
+logging.basicConfig(
+    filename=str(LOG_FILE),
+    level=logging.ERROR,
+    format="%(asctime)s | %(message)s",
+    datefmt="%Y-%m-%d %H:%M:%S",
+    encoding="utf-8",
+)
+
+IMPORTANCE_MAP  = {"low": 0, "normal": 1, "high": 2}
+FLAG_STATUS_MAP = {"notFlagged": 0, "flagged": 1, "complete": 2}
+RE_SUBJECT      = re.compile(r"^(RE|FW|AW|SV|VS|TR|WG|odpov[eě]d[ťt]|fwd?)[:\s]+", re.IGNORECASE)
+
+MSG_SELECT = (
+    "id,internetMessageId,subject,bodyPreview,body,"
+    "importance,isRead,isDraft,hasAttachments,"
+    "receivedDateTime,sentDateTime,createdDateTime,lastModifiedDateTime,"
+    "sender,from,toRecipients,ccRecipients,bccRecipients,replyTo,"
+    "conversationId,conversationIndex,parentFolderId,"
+    "categories,flag,inferenceClassification,internetMessageHeaders"
+)
+
+
+# ─── Graph API helpers ────────────────────────────────────────────────────────
+
+_graph_token: Optional[str] = None
+
+
+def get_token() -> str:
+    global _graph_token
+    app = msal.ConfidentialClientApplication(
+        GRAPH_CLIENT_ID,
+        authority=f"https://login.microsoftonline.com/{GRAPH_TENANT_ID}",
+        client_credential=GRAPH_CLIENT_SECRET,
+    )
+    result = app.acquire_token_for_client(scopes=["https://graph.microsoft.com/.default"])
+    if "access_token" not in result:
+        raise RuntimeError(f"Graph auth failed: {result}")
+    _graph_token = result["access_token"]
+    return _graph_token
+
+
+def graph_get(url: str, params: dict = None) -> dict:
+    global _graph_token
+    if not _graph_token:
+        get_token()
+    for attempt in range(2):
+        r = requests.get(url, headers={"Authorization": f"Bearer {_graph_token}"}, params=params, timeout=30)
+        if r.status_code == 401:
+            get_token()
+            continue
+        r.raise_for_status()
+        return r.json()
+    raise RuntimeError(f"Graph GET failed after retry: {url}")
+
+
+def get_all_folders(parent_id: Optional[str] = None, parent_path: str = "") -> list[dict]:
+    """Recursively load all mailbox folders. Returns [{id, path}]."""
+    if parent_id is None:
+        url = f"{GRAPH_URL}/users/{GRAPH_MAILBOX}/mailFolders"
+    else:
+        url = f"{GRAPH_URL}/users/{GRAPH_MAILBOX}/mailFolders/{parent_id}/childFolders"
+
+    folders = []
+    params = {"$top": 100, "$select": "id,displayName,childFolderCount"}
+    while url:
+        data = graph_get(url, params)
+        for f in data.get("value", []):
+            path = f"{parent_path}/{f['displayName']}".lstrip("/")
+            folders.append({"id": f["id"], "path": path})
+            if f.get("childFolderCount", 0) > 0:
+                folders.extend(get_all_folders(f["id"], path))
+        url = data.get("@odata.nextLink")
+        params = None
+    return folders
+
+
+def iter_folder_messages(folder_id: str):
+    """Generator: yields the messages of a folder, fetched page by page."""
+    url = f"{GRAPH_URL}/users/{GRAPH_MAILBOX}/mailFolders/{folder_id}/messages"
+    params = {"$top": PAGE_SIZE, "$select": MSG_SELECT, "$expand": "attachments"}
+    while url:
+        data = graph_get(url, params)
+        for msg in data.get("value", []):
+            yield msg
+        url = data.get("@odata.nextLink")
+        params = None
+
+
+# ─── Helper functions ─────────────────────────────────────────────────────────
+
+def parse_date(raw) -> Optional[datetime]:
+    if raw is None:
+        return None
+    if isinstance(raw, datetime):
+        if raw.tzinfo:
+            return raw.astimezone(timezone.utc).replace(tzinfo=None)
+        return raw
+    try:
+        dt = dtparser.parse(str(raw))
+        if dt.tzinfo:
+            return dt.astimezone(timezone.utc).replace(tzinfo=None)
+        return dt
+    except Exception:
+        return None
+
+
+def normalize_subject(subject: str) -> str:
+    s = subject.strip()
+    while True:
+        m = RE_SUBJECT.match(s)
+        if not m:
+            break
+        s = s[m.end():].strip()
+    return s
+
+
+def parse_headers(raw_headers: list) -> dict:
+    result = {}
+    for h in raw_headers:
+        k = h["name"].lower().replace("-", "_")
+        v = h["value"]
+        if k in result:
+            existing = result[k]
+            if isinstance(existing, list):
+                existing.append(v)
+            else:
+                result[k] = [existing, v]
+        else:
+            result[k] = v
+    return result
+
+
+def format_recipients(lst: list) -> str:
+    return "; ".join(
+        f'{r["emailAddress"].get("name", "")} <{r["emailAddress"].get("address", "")}>'.strip()
+        for r in lst
+    )
+
+
+# ─── Main extraction ─────────────────────────────────────────────────────────
+
+def extract_message(msg: dict, folder_path: str) -> Optional[dict]:
+    try:
+        # _id
+        mid = (msg.get("internetMessageId") or "").strip()
+        if not mid:
+            mid = f"graphid:{msg['id']}"
+
+        subject = msg.get("subject") or ""
+        norm_subject = normalize_subject(subject)
+
+        # body
+        body_html = None
+        body_preview = msg.get("bodyPreview") or ""
+        body = msg.get("body") or {}
+        if body.get("contentType") == "html":
+            content = body.get("content") or ""
+            body_html = content[:2 * 1024 * 1024]  # cap at 2 Mi characters
+        elif body.get("contentType") == "text":
+            body_preview = (body.get("content") or "")[:2000]
+
+        # sender
+        sender_ea = (msg.get("from") or msg.get("sender") or {}).get("emailAddress", {})
+        sender_email = sender_ea.get("address", "")
+        sender_name  = sender_ea.get("name", "")
+
+        # recipients (a key present with a null value must not break iteration)
+        to_list  = msg.get("toRecipients") or []
+        cc_list  = msg.get("ccRecipients") or []
+        bcc_list = msg.get("bccRecipients") or []
+
+        recipients = (
+            [{"type": "to",  "email": r["emailAddress"].get("address",""), "name": r["emailAddress"].get("name","")} for r in to_list] +
+            [{"type": "cc",  "email": r["emailAddress"].get("address",""), "name": r["emailAddress"].get("name","")} for r in cc_list] +
+            [{"type": "bcc", "email": r["emailAddress"].get("address",""), "name": r["emailAddress"].get("name","")} for r in bcc_list]
+        )
+
+        # flags
+        importance  = IMPORTANCE_MAP.get(msg.get("importance", "normal"), 1)
+        flag_status = FLAG_STATUS_MAP.get((msg.get("flag") or {}).get("flagStatus", "notFlagged"), 0)
+
+        # internet headers
+        raw_headers = msg.get("internetMessageHeaders") or []
+        headers = parse_headers(raw_headers)
+
+        in_reply_to = headers.get("in_reply_to", "")
+        if isinstance(in_reply_to, list):
+            in_reply_to = in_reply_to[0]
+
+        refs_raw = headers.get("references", "")
+        if isinstance(refs_raw, list):
+            refs_raw = " ".join(refs_raw)
+        internet_refs = [r.strip() for r in refs_raw.split() if r.strip()] if refs_raw else []
+
+        conv_topic = headers.get("thread_topic", "")
+        if isinstance(conv_topic, list):
+            conv_topic = conv_topic[0]
+
+        # conversation index
+        conv_index = ""
+        ci_raw = msg.get("conversationIndex")
+        if ci_raw:
+            try:
+                conv_index = base64.b64encode(base64.b64decode(ci_raw)).decode()
+            except Exception:
+                conv_index = ci_raw
+
+        # attachments (metadata only, no content)
+        attachments = []
+        for att in msg.get("attachments") or []:
+            fname = att.get("name") or ""
+            if not fname:
+                continue
+            attachments.append({
+                "filename":   fname,
+                "size_bytes": att.get("size", 0),
+                "mime_type":  att.get("contentType", "application/octet-stream"),
+                "content_id": att.get("contentId"),
+                "is_inline":  att.get("isInline", False),
+            })
+
+        return {
+            "_id":     mid,
+            "graph_id": msg["id"],
+
+            "subject":            subject,
+            "normalized_subject": norm_subject,
+            "importance":         importance,
+            "flag_status":        flag_status,
+            "is_read":            msg.get("isRead", False),
+            "is_draft":           msg.get("isDraft", False),
+            "has_attachments":    msg.get("hasAttachments", False),
+            "attachment_count":   len(attachments),
+            "inference_classification": msg.get("inferenceClassification", ""),
+            "categories":         msg.get("categories") or [],
+
+            "conversation_id":    msg.get("conversationId", ""),
+            "conversation_index": conv_index,
+            "conversation_topic": conv_topic,
+            "in_reply_to":        in_reply_to,
+            "internet_references": internet_refs,
+
+            "received_at": parse_date(msg.get("receivedDateTime")),
+            "sent_at":     parse_date(msg.get("sentDateTime")),
+            "created_at":  parse_date(msg.get("createdDateTime")),
+            "modified_at": parse_date(msg.get("lastModifiedDateTime")),
+
+            "folder_id":   msg.get("parentFolderId", ""),
+            "folder_path": folder_path,
+
+            "sender": {
+                "email": sender_email,
+                "name":  sender_name,
+            },
+            "to":         format_recipients(to_list),
+            "cc":         format_recipients(cc_list),
+            "bcc":        format_recipients(bcc_list),
+            "recipients": recipients,
+
+            "body_html":    body_html,
+            "body_preview": body_preview,
+
+            "attachments": attachments,
+            "headers":     headers,
+
+            "parsed_at": datetime.now(timezone.utc).replace(tzinfo=None),
+        }
+
+    except Exception as e:
+        logging.error("extract_message failed [%s]: %s", msg.get("id", "?"), e)
+        return None
+
+
+# ─── MongoDB indexes ──────────────────────────────────────────────────────────
+
+def create_indexes(col):
+    print("  Creating indexes...")
+    col.create_index([("received_at",    ASCENDING)])
+    col.create_index([("sent_at",        ASCENDING)])
+    col.create_index([("sender.email",   ASCENDING)])
+    col.create_index([("graph_id",       ASCENDING)], unique=True, sparse=True)
+    col.create_index([("conversation_id", ASCENDING)])
+    col.create_index([("folder_path",    ASCENDING)])
+    col.create_index([("has_attachments", ASCENDING)])
+    col.create_index([("categories",     ASCENDING)])
+    col.create_index([("importance",     ASCENDING)])
+    col.create_index([("flag_status",    ASCENDING)])
+    col.create_index([("is_read",        ASCENDING)])
+    col.create_index([
+        ("subject",       TEXT),
+        ("body_preview",  TEXT),
+        ("to",            TEXT),
+        ("cc",            TEXT),
+    ], name="text_search", default_language="none")
+    print("  Indexes done.")
+
+
+# ─── MAIN ─────────────────────────────────────────────────────────────────────
+
+def main():
+    ap = argparse.ArgumentParser(description=f"parse_emails_graph v{SCRIPT_VERSION}")
+    ap.add_argument("--limit",         type=int, default=0,
+                    help="Process at most N messages (0 = all)")
+    ap.add_argument("--skip-existing", action="store_true",
+                    help="Skip messages that are already in MongoDB")
+    ap.add_argument("--folder",        default="",
+                    help="Process only folders whose path contains the given name (e.g. Inbox)")
+    ap.add_argument("--no-indexes",    action="store_true",
+                    help="Do not create indexes at the end")
+    args = ap.parse_args()
+
+    start = datetime.now()
+    print(f"=== parse_emails_graph v{SCRIPT_VERSION} ===")
+    print(f"Start:    {start.strftime('%Y-%m-%d %H:%M:%S')}")
+    print(f"Mailbox:  {GRAPH_MAILBOX}")
+    print(f"MongoDB:  {MONGO_URI} -> {MONGO_DB}.{MONGO_COL}")
+
+    # Graph token
+    print("\nConnecting to Graph API...")
+    try:
+        get_token()
+        print("  Graph API OK")
+    except Exception as e:
+        print(f"  ERROR: {e}")
+        sys.exit(1)
+
+    # MongoDB
+    client = MongoClient(MONGO_URI, serverSelectionTimeoutMS=5000)
+    try:
+        client.admin.command("ping")
+        print("  MongoDB OK")
+    except Exception as e:
+        print(f"  ERROR: MongoDB is not available -- {e}")
+        sys.exit(1)
+    col = client[MONGO_DB][MONGO_COL]
+
+    # Skip existing
+    existing: set = set()
+    if args.skip_existing:
+        print("  Loading existing records from MongoDB...")
+        # Stream the ids through a cursor: distinct() returns its result as a
+        # single document, which is capped at 16 MB and fails on large mailboxes.
+        existing = {d["_id"] for d in col.find({}, {"_id": 1})}
+        print(f"  {len(existing)} already imported")
+
+    # Folders
+    print("\nLoading folder list...")
+    all_folders = get_all_folders()
+    if args.folder:
+        all_folders = [f for f in all_folders if args.folder.lower() in f["path"].lower()]
+    print(f"  Folders to process: {len(all_folders)}")
+    for f in all_folders:
+        print(f"    {f['path']}")
+    # Import
+    batch     = []
+    ok_count  = 0
+    err_count = 0
+    skip_count = 0
+    total_i   = 0
+
+    def flush():
+        if not batch:
+            return
+        try:
+            col.bulk_write(batch, ordered=False)
+        except Exception as e:
+            logging.error("bulk_write: %s", e)
+            print(f"  ERROR bulk_write: {e}")
+        batch.clear()
+
+    print()
+    for folder in all_folders:
+        print(f"--- Folder: {folder['path']} ---")
+        folder_count = 0
+
+        for msg in iter_folder_messages(folder["id"]):
+            if args.limit and total_i >= args.limit:
+                break
+
+            mid = (msg.get("internetMessageId") or "").strip() or f"graphid:{msg['id']}"
+
+            if mid in existing:
+                skip_count += 1
+                total_i += 1
+                continue
+
+            doc = extract_message(msg, folder["path"])
+            total_i += 1
+            folder_count += 1
+
+            if doc is None:
+                err_count += 1
+            else:
+                batch.append(UpdateOne({"_id": doc["_id"]}, {"$set": doc}, upsert=True))
+                ok_count += 1
+
+            if len(batch) >= BATCH_SIZE:
+                flush()
+
+            status      = "ERR " if doc is None else "OK  "
+            subject_str = (doc.get("subject") or "")[:60] if doc else "?"
+            sender_str  = (doc.get("sender", {}).get("email") or "")[:40] if doc else "?"
+            print(f"  {total_i:>6}  {status}  {subject_str:<60}  {sender_str}")
+
+            if total_i % 500 == 0:
+                elapsed = (datetime.now() - start).total_seconds()
+                rate    = total_i / elapsed if elapsed > 0 else 0
+                print(f"  {'─'*80}")
+                print(f"  Progress: ok={ok_count}  skip={skip_count}  err={err_count}  {rate:.1f} msg/s")
+                print(f"  {'─'*80}")
+
+        flush()
+        print(f"  → {folder_count} messages from folder {folder['path']}")
+
+        if args.limit and total_i >= args.limit:
+            break
+
+    elapsed_total = (datetime.now() - start).total_seconds()
+    print(f"\n{'='*52}")
+    print(f"Result:  ok={ok_count}  |  skip={skip_count}  |  err={err_count}")
+    print(f"Total time: {int(elapsed_total//3600)}h {int((elapsed_total%3600)//60)}m {int(elapsed_total%60)}s")
+    print(f"Documents in collection: {col.count_documents({})}")
+
+    if not args.no_indexes:
+        print()
+        create_indexes(col)
+
+    print(f"\nEnd: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
+    if err_count:
+        print(f"Errors logged to: {LOG_FILE}")
+
+    client.close()
+
+
+if __name__ == "__main__":
+    main()
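
Review note on `get_all_folders` and `iter_folder_messages` above: both loops follow `@odata.nextLink` until it is absent, and both reset `params` to `None` after the first request because the next link already carries the full query string. The sketch below factors that pattern into one generator. It is only an illustration: `iter_pages` and `fake_fetch` are hypothetical names, and the fetch function is injected so the example runs without network access.

```python
from typing import Callable, Iterator, Optional


def iter_pages(
    fetch: Callable[[str, Optional[dict]], dict],
    first_url: str,
    params: Optional[dict] = None,
) -> Iterator[dict]:
    """Yield every item of a paged collection by following @odata.nextLink.

    `fetch(url, params)` is assumed to return the decoded JSON page, in the
    same way graph_get() does in the script.
    """
    url: Optional[str] = first_url
    while url:
        data = fetch(url, params)
        yield from data.get("value", [])
        url = data.get("@odata.nextLink")
        # The next link is a complete URL; sending the query parameters
        # again would duplicate them.
        params = None


if __name__ == "__main__":
    # Two fake pages standing in for Graph responses.
    pages = {
        "page1": {"value": [{"id": "a"}, {"id": "b"}], "@odata.nextLink": "page2"},
        "page2": {"value": [{"id": "c"}]},
    }

    def fake_fetch(url: str, params: Optional[dict]) -> dict:
        return pages[url]

    for item in iter_pages(fake_fetch, "page1", {"$top": 2}):
        print(item["id"])
```

With a helper of this shape, `iter_folder_messages` would reduce to a single call that passes `graph_get` as the fetch function.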
@@ -0,0 +1,605 @@
+"""
+parse_emails_graph_v1.1.py
+Nazev:  parse_emails_graph_v1.1.py
+Verze:  1.1
+Datum:  2026-06-02
+Autor:  vladimir.buzalka
+
+Description:
+    Reads all emails from the mailbox ordinace@buzalkova.cz directly via the
+    Microsoft Graph API and imports them as documents into MongoDB.
+    Extracts every available property from each message:
+
+        - subject, sender, recipients (To/CC/BCC with types)
+        - received, sent, created, and modified times (UTC)
+        - HTML body (max 2 MB) + text preview
+        - attachments (metadata: name, size, MIME type, inline flag)
+        - internet headers (SPF, DKIM, Received, X-*, ...)
+        - MAPI equivalents: importance, flag, conversation thread,
+          categories, In-Reply-To, References, ...
+        - additionally: isRead, isDraft, folder_path, inferenceClassification
+
+    Walks ALL mailbox folders recursively (Inbox, Sent, Deleted,
+    archive folders, ...).
+
+    DB:          emaily
+    Collection:  ordinace@buzalkova.cz
+    _id:         Internet Message-ID (or "graphid:<id>" as a fallback)
+
+    NOTE: The script only READS from the mailbox; it never writes to it!
+
+Usage:
+    # First import (everything):
+    python parse_emails_graph_v1.1.py
+
+    # Test on the first 50 messages:
+    python parse_emails_graph_v1.1.py --limit 50 --no-indexes
+
+    # Only folders whose path contains "Inbox" (case-insensitive):
+    python parse_emails_graph_v1.1.py --folder Inbox
+
+    # Resume after an interruption (new messages only):
+    python parse_emails_graph_v1.1.py --mode new-only
+
+    # Regular sync (updates is_read, flag, folder; imports new messages):
+    python parse_emails_graph_v1.1.py --mode sync
+
+    # Full reimport of all data:
+    python parse_emails_graph_v1.1.py --mode full
+
+Modes (--mode):
+    full      Full upsert of all fields for every message (default)
+    new-only  Skips messages already in MongoDB, imports only new ones
+    sync      Existing: updates only is_read/is_draft/flag_status/importance/
+              categories/modified_at/folder_id/folder_path. New messages are
+              imported in full. Ideal for regular runs.
+
+Dependencies:
+    msal, requests, pymongo, python-dateutil
+    Python 3.10+
+
+Document structure in MongoDB:
+    _id                     Internet Message-ID (or graphid: fallback)
+    graph_id                Graph API message ID
+    subject                 message subject
+    normalized_subject      subject without RE:/FW:/AW: prefixes
+    importance              0=low 1=normal 2=high
+    flag_status             0=not flagged 1=flagged 2=complete
+    is_read                 bool, current read state in the mailbox
+    is_draft                bool
+    has_attachments         bool
+    attachment_count        int
+    inference_classification focused / other
+    categories              [str]
+    conversation_id         Graph conversationId
+    conversation_index      base64 conversationIndex
+    conversation_topic      thread topic (from the Thread-Topic internet header)
+    in_reply_to             Message-ID of the previous message
+    internet_references     [Message-ID]
+    received_at             datetime UTC
+    sent_at                 datetime UTC
+    created_at              datetime UTC
+    modified_at             datetime UTC
+    folder_id               Graph parentFolderId
+    folder_path             full folder path (e.g. Inbox/Subfolder)
+    sender.email            sender's email address
+    sender.name             display name
+    to                      To string (joined)
+    cc                      CC string
+    bcc                     BCC string
+    recipients              [{type, email, name}]
+    body_html               HTML body (max 2 MB)
+    body_preview            text preview (max 255 chars; up to 2000 for plain-text bodies)
+    attachments             [{filename, size_bytes, mime_type, content_id, is_inline}]
+    headers                 dict of internet headers
+    parsed_at               datetime UTC
+
+Indexes:
+    received_at, sent_at, sender.email, graph_id (unique),
+    conversation_id, folder_path, has_attachments, categories,
+    importance, flag_status, is_read,
+    text_search (subject + body_preview + to + cc)
+
+Version history:
+    1.0  2026-06-02  Initial version
+    1.1  2026-06-02  Added --mode full/new-only/sync;
+                     removed --skip-existing (replaced by --mode new-only)
+"""
+
+import os
+import sys
+import re
+import logging
+import argparse
+import base64
+from pathlib import Path
+from datetime import datetime, timezone
+from typing import Optional
+
+import msal
+import requests
+from dateutil import parser as dtparser
+from pymongo import MongoClient, UpdateOne, ASCENDING, TEXT
+
+if hasattr(sys.stdout, "reconfigure"):
+    sys.stdout.reconfigure(encoding="utf-8", errors="replace")
+
+# ─── CONFIGURATION ────────────────────────────────────────────────────────────
+GRAPH_TENANT_ID     = "7d269944-37a4-43a1-8140-c7517dc426e9"
+GRAPH_CLIENT_ID     = "4b222bfd-78c9-4239-a53f-43006b3ed07f"
+# Secret is read from the environment (variable name is an assumption), never committed to the repo.
+GRAPH_CLIENT_SECRET = os.environ["GRAPH_CLIENT_SECRET"]
+GRAPH_MAILBOX       = "ordinace@buzalkova.cz"
+GRAPH_URL           = "https://graph.microsoft.com/v1.0"
+
+MONGO_URI      = "mongodb://192.168.1.76:27017"
+MONGO_DB       = "emaily"
+MONGO_COL      = "ordinace@buzalkova.cz"
+BATCH_SIZE     = 100
+PAGE_SIZE      = 50
+LOG_FILE       = Path(__file__).parent / "parse_emails_errors.log"
+SCRIPT_VERSION = "1.1"
+# ──────────────────────────────────────────────────────────────────────────────
+
+logging.basicConfig(
+    filename=str(LOG_FILE),
+    level=logging.ERROR,
+    format="%(asctime)s | %(message)s",
+    datefmt="%Y-%m-%d %H:%M:%S",
+    encoding="utf-8",
+)
+
+IMPORTANCE_MAP  = {"low": 0, "normal": 1, "high": 2}
+FLAG_STATUS_MAP = {"notFlagged": 0, "flagged": 1, "complete": 2}
+RE_SUBJECT      = re.compile(r"^(RE|FW|AW|SV|VS|TR|WG|odpov[eě]d[ťt]|fwd?)[:\s]+", re.IGNORECASE)
+
+MSG_SELECT = (
+    "id,internetMessageId,subject,bodyPreview,body,"
+    "importance,isRead,isDraft,hasAttachments,"
+    "receivedDateTime,sentDateTime,createdDateTime,lastModifiedDateTime,"
+    "sender,from,toRecipients,ccRecipients,bccRecipients,replyTo,"
+    "conversationId,conversationIndex,parentFolderId,"
+    "categories,flag,inferenceClassification,internetMessageHeaders"
+)
+
+# Sync mode needs only the mutable fields, which makes the fetch faster
+MSG_SELECT_SYNC = (
+    "id,internetMessageId,isRead,isDraft,flag,categories,"
+    "lastModifiedDateTime,parentFolderId,importance"
+)
+
+
+# ─── Graph API helpers ────────────────────────────────────────────────────────
+
+_graph_token: Optional[str] = None
+
+
+def get_token() -> str:
+    global _graph_token
+    app = msal.ConfidentialClientApplication(
+        GRAPH_CLIENT_ID,
+        authority=f"https://login.microsoftonline.com/{GRAPH_TENANT_ID}",
+        client_credential=GRAPH_CLIENT_SECRET,
+    )
+    result = app.acquire_token_for_client(scopes=["https://graph.microsoft.com/.default"])
+    if "access_token" not in result:
+        raise RuntimeError(f"Graph auth failed: {result}")
+    _graph_token = result["access_token"]
+    return _graph_token
+
+
+def graph_get(url: str, params: Optional[dict] = None) -> dict:
+    global _graph_token
+    if not _graph_token:
+        get_token()
+    for attempt in range(2):
+        r = requests.get(url, headers={"Authorization": f"Bearer {_graph_token}"}, params=params, timeout=30)
+        if r.status_code == 401:
+            get_token()
+            continue
+        r.raise_for_status()
+        return r.json()
+    raise RuntimeError(f"Graph GET failed after retry: {url}")
+
+
+def get_all_folders(parent_id: Optional[str] = None, parent_path: str = "") -> list[dict]:
+    """Recursively loads all mailbox folders. Returns [{id, path}]."""
+    if parent_id is None:
+        url = f"{GRAPH_URL}/users/{GRAPH_MAILBOX}/mailFolders"
+    else:
+        url = f"{GRAPH_URL}/users/{GRAPH_MAILBOX}/mailFolders/{parent_id}/childFolders"
+
+    folders = []
+    params = {"$top": 100, "$select": "id,displayName,childFolderCount"}
+    while url:
+        data = graph_get(url, params)
+        for f in data.get("value", []):
+            path = f"{parent_path}/{f['displayName']}".lstrip("/")
+            folders.append({"id": f["id"], "path": path})
+            if f.get("childFolderCount", 0) > 0:
+                folders.extend(get_all_folders(f["id"], path))
+        url = data.get("@odata.nextLink")
+        params = None
+    return folders
+
+
+def iter_folder_messages(folder_id: str, select: str = MSG_SELECT, expand_attachments: bool = True):
+    """Generator: yields messages from a folder, page by page."""
+    url = f"{GRAPH_URL}/users/{GRAPH_MAILBOX}/mailFolders/{folder_id}/messages"
+    params = {"$top": PAGE_SIZE, "$select": select}
+    if expand_attachments:
+        params["$expand"] = "attachments"
+    while url:
+        data = graph_get(url, params)
+        for msg in data.get("value", []):
+            yield msg
+        url = data.get("@odata.nextLink")
+        params = None
+
+
+# ─── Helper functions ─────────────────────────────────────────────────────────
+
+def parse_date(raw) -> Optional[datetime]:
+    if raw is None:
+        return None
+    if isinstance(raw, datetime):
+        if raw.tzinfo:
+            return raw.astimezone(timezone.utc).replace(tzinfo=None)
+        return raw
+    try:
+        dt = dtparser.parse(str(raw))
+        if dt.tzinfo:
+            return dt.astimezone(timezone.utc).replace(tzinfo=None)
+        return dt
+    except Exception:
+        return None
+
+
+def normalize_subject(subject: str) -> str:
+    s = subject.strip()
+    while True:
+        m = RE_SUBJECT.match(s)
+        if not m:
+            break
+        s = s[m.end():].strip()
+    return s
+
+
+def parse_headers(raw_headers: list) -> dict:
+    result = {}
+    for h in raw_headers:
+        k = h["name"].lower().replace("-", "_")
+        v = h["value"]
+        if k in result:
+            existing = result[k]
+            result[k] = existing + [v] if isinstance(existing, list) else [existing, v]
+        else:
+            result[k] = v
+    return result
+
+
+def format_recipients(lst: list) -> str:
+    return "; ".join(
+        f'{r["emailAddress"].get("name", "")} <{r["emailAddress"].get("address", "")}>'.strip()
+        for r in lst
+    )
+
+
+# ─── Message extraction ───────────────────────────────────────────────────────
+
+def extract_message(msg: dict, folder_path: str) -> Optional[dict]:
+    """Full extraction: used for mode full and for new messages in sync/new-only."""
+    try:
+        mid = (msg.get("internetMessageId") or "").strip() or f"graphid:{msg['id']}"
+        subject = msg.get("subject") or ""
+
+        body_html = None
+        body_preview = msg.get("bodyPreview") or ""
+        body = msg.get("body") or {}
+        if body.get("contentType") == "html":
+            content = body.get("content") or ""
+            body_html = content[:2 * 1024 * 1024]
+        elif body.get("contentType") == "text":
+            body_preview = (body.get("content") or "")[:2000]
+
+        sender_ea    = (msg.get("from") or msg.get("sender") or {}).get("emailAddress", {})
+        to_list      = msg.get("toRecipients", [])
+        cc_list      = msg.get("ccRecipients", [])
+        bcc_list     = msg.get("bccRecipients", [])
+
+        recipients = (
+            [{"type": "to",  "email": r["emailAddress"].get("address",""), "name": r["emailAddress"].get("name","")} for r in to_list] +
+            [{"type": "cc",  "email": r["emailAddress"].get("address",""), "name": r["emailAddress"].get("name","")} for r in cc_list] +
+            [{"type": "bcc", "email": r["emailAddress"].get("address",""), "name": r["emailAddress"].get("name","")} for r in bcc_list]
+        )
+
+        importance  = IMPORTANCE_MAP.get(msg.get("importance", "normal"), 1)
+        flag_status = FLAG_STATUS_MAP.get((msg.get("flag") or {}).get("flagStatus", "notFlagged"), 0)
+
+        raw_headers   = msg.get("internetMessageHeaders") or []
+        headers       = parse_headers(raw_headers)
+
+        in_reply_to = headers.get("in_reply_to", "")
+        if isinstance(in_reply_to, list):
+            in_reply_to = in_reply_to[0]
+
+        refs_raw = headers.get("references", "")
+        if isinstance(refs_raw, list):
+            refs_raw = " ".join(refs_raw)
+        internet_refs = [r.strip() for r in refs_raw.split() if r.strip()] if refs_raw else []
+
+        conv_topic = headers.get("thread_topic", "")
+        if isinstance(conv_topic, list):
+            conv_topic = conv_topic[0]
+
+        conv_index = ""
+        ci_raw = msg.get("conversationIndex")
+        if ci_raw:
+            try:
+                conv_index = base64.b64encode(base64.b64decode(ci_raw)).decode()
+            except Exception:
+                conv_index = ci_raw
+
+        attachments = []
+        for att in msg.get("attachments") or []:
+            fname = att.get("name") or ""
+            if not fname:
+                continue
+            attachments.append({
+                "filename":   fname,
+                "size_bytes": att.get("size", 0),
+                "mime_type":  att.get("contentType", "application/octet-stream"),
+                "content_id": att.get("contentId"),
+                "is_inline":  att.get("isInline", False),
+            })
+
+        return {
+            "_id":      mid,
+            "graph_id": msg["id"],
+
+            "subject":            subject,
+            "normalized_subject": normalize_subject(subject),
+            "importance":         importance,
+            "flag_status":        flag_status,
+            "is_read":            msg.get("isRead", False),
+            "is_draft":           msg.get("isDraft", False),
+            "has_attachments":    msg.get("hasAttachments", False),
+            "attachment_count":   len(attachments),
+            "inference_classification": msg.get("inferenceClassification", ""),
+            "categories":         msg.get("categories") or [],
+
+            "conversation_id":     msg.get("conversationId", ""),
+            "conversation_index":  conv_index,
+            "conversation_topic":  conv_topic,
+            "in_reply_to":         in_reply_to,
+            "internet_references": internet_refs,
+
+            "received_at": parse_date(msg.get("receivedDateTime")),
+            "sent_at":     parse_date(msg.get("sentDateTime")),
+            "created_at":  parse_date(msg.get("createdDateTime")),
+            "modified_at": parse_date(msg.get("lastModifiedDateTime")),
+
+            "folder_id":   msg.get("parentFolderId", ""),
+            "folder_path": folder_path,
+
+            "sender": {
+                "email": sender_ea.get("address", ""),
+                "name":  sender_ea.get("name", ""),
+            },
+            "to":         format_recipients(to_list),
+            "cc":         format_recipients(cc_list),
+            "bcc":        format_recipients(bcc_list),
+            "recipients": recipients,
+
+            "body_html":    body_html,
+            "body_preview": body_preview,
+
+            "attachments": attachments,
+            "headers":     headers,
+
+            "parsed_at": datetime.now(timezone.utc).replace(tzinfo=None),
+        }
+
+    except Exception as e:
+        logging.error("extract_message failed [%s]: %s", msg.get("id", "?"), e)
+        return None
+
+
+def extract_sync_fields(msg: dict, folder_path: str) -> dict:
+    """Mutable fields only: used in sync mode for existing messages."""
+    return {
+        "is_read":    msg.get("isRead", False),
+        "is_draft":   msg.get("isDraft", False),
+        "flag_status": FLAG_STATUS_MAP.get((msg.get("flag") or {}).get("flagStatus", "notFlagged"), 0),
+        "importance":  IMPORTANCE_MAP.get(msg.get("importance", "normal"), 1),
+        "categories":  msg.get("categories") or [],
+        "modified_at": parse_date(msg.get("lastModifiedDateTime")),
+        "folder_id":   msg.get("parentFolderId", ""),
+        "folder_path": folder_path,
+        "parsed_at":   datetime.now(timezone.utc).replace(tzinfo=None),
+    }
+
+
+# ─── MongoDB indexes ──────────────────────────────────────────────────────────
+
+def create_indexes(col):
+    print("  Creating indexes...")
+    col.create_index([("received_at",     ASCENDING)])
+    col.create_index([("sent_at",         ASCENDING)])
+    col.create_index([("sender.email",    ASCENDING)])
+    col.create_index([("graph_id",        ASCENDING)], unique=True, sparse=True)
+    col.create_index([("conversation_id", ASCENDING)])
+    col.create_index([("folder_path",     ASCENDING)])
+    col.create_index([("has_attachments", ASCENDING)])
+    col.create_index([("categories",      ASCENDING)])
+    col.create_index([("importance",      ASCENDING)])
+    col.create_index([("flag_status",     ASCENDING)])
+    col.create_index([("is_read",         ASCENDING)])
+    col.create_index([
+        ("subject",      TEXT),
+        ("body_preview", TEXT),
+        ("to",           TEXT),
+        ("cc",           TEXT),
+    ], name="text_search", default_language="none")
+    print("  Indexes done.")
+
+
+# ─── MAIN ─────────────────────────────────────────────────────────────────────
+
+def main():
+    ap = argparse.ArgumentParser(description=f"parse_emails_graph v{SCRIPT_VERSION}")
+    ap.add_argument("--mode", default="full", choices=["full", "new-only", "sync"],
+                    help="full=full upsert (default) | new-only=new messages only | "
+                         "sync=update only mutable fields of existing messages, import new ones in full")
+    ap.add_argument("--limit",      type=int, default=0,
+                    help="Process at most N messages (0 = all)")
+    ap.add_argument("--folder",     default="",
+                    help="Process only folders whose path contains this name, case-insensitive (e.g. Inbox)")
+    ap.add_argument("--no-indexes", action="store_true",
+                    help="Do not create indexes at the end")
+    args = ap.parse_args()
+
+    start = datetime.now()
+    print(f"=== parse_emails_graph v{SCRIPT_VERSION} ===")
+    print(f"Start:    {start.strftime('%Y-%m-%d %H:%M:%S')}")
+    print(f"Mailbox:  {GRAPH_MAILBOX}")
+    print(f"MongoDB:  {MONGO_URI} -> {MONGO_DB}.{MONGO_COL}")
+    print(f"Mode:     {args.mode}")
+
+    print("\nConnecting to Graph API...")
+    try:
+        get_token()
+        print("  Graph API OK")
+    except Exception as e:
+        print(f"  ERROR: {e}")
+        sys.exit(1)
+
+    client = MongoClient(MONGO_URI, serverSelectionTimeoutMS=5000)
+    try:
+        client.admin.command("ping")
+        print("  MongoDB OK")
+    except Exception as e:
+        print(f"  ERROR: MongoDB is not available -- {e}")
+        sys.exit(1)
+    col = client[MONGO_DB][MONGO_COL]
+
+    # Existing _id values (needed for new-only and sync)
+    existing: set = set()
+    if args.mode in ("new-only", "sync"):
+        print("  Loading existing records from MongoDB...")
+        existing = set(col.distinct("_id"))
+        print(f"  {len(existing)} already imported")
+
+    print("\nLoading folder list...")
+    all_folders = get_all_folders()
+    if args.folder:
+        all_folders = [f for f in all_folders if args.folder.lower() in f["path"].lower()]
+    print(f"  Folders to process: {len(all_folders)}")
+    for f in all_folders:
+        print(f"    {f['path']}")
+
+    # In sync mode we fetch only the mutable fields
+    is_sync    = args.mode == "sync"
+    msg_select = MSG_SELECT_SYNC if is_sync else MSG_SELECT
+    expand_att = not is_sync
+
+    batch      = []
+    ok_count   = 0
+    sync_count = 0
+    err_count  = 0
+    skip_count = 0
+    total_i    = 0
+
+    def flush():
+        if not batch:
+            return
+        try:
+            col.bulk_write(batch, ordered=False)
+        except Exception as e:
+            logging.error("bulk_write: %s", e)
+            print(f"  ERROR bulk_write: {e}")
+        batch.clear()
+
+    print()
+    for folder in all_folders:
+        print(f"--- Folder: {folder['path']} ---")
+        folder_count = 0
+
+        for msg in iter_folder_messages(folder["id"], select=msg_select, expand_attachments=expand_att):
+            if args.limit and total_i >= args.limit:
+                break
+
+            mid = (msg.get("internetMessageId") or "").strip() or f"graphid:{msg['id']}"
+            total_i += 1
+            folder_count += 1
+
+            if args.mode == "new-only" and mid in existing:
+                skip_count += 1
+                continue
+
+            if is_sync and mid in existing:
+                # Sync an existing message: mutable fields only
+                fields = extract_sync_fields(msg, folder["path"])
+                batch.append(UpdateOne({"_id": mid}, {"$set": fields}))
+                sync_count += 1
+                status = "SYN "
+                print(f"  {total_i:>6}  {status}  {mid[:80]}")
+            else:
+                # Full extract (new messages in new-only and sync, everything in full)
+                # New messages found in sync mode need a full fetch
+                if is_sync:
+                    full_url = f"{GRAPH_URL}/users/{GRAPH_MAILBOX}/messages/{msg['id']}"
+                    full_params = {"$select": MSG_SELECT, "$expand": "attachments"}
+                    try:
+                        msg = graph_get(full_url, full_params)
+                    except Exception as e:
+                        logging.error("full fetch failed [%s]: %s", msg.get("id","?"), e)
+                        err_count += 1
+                        continue
+
+                doc = extract_message(msg, folder["path"])
+                if doc is None:
+                    err_count += 1
+                    status = "ERR "
+                    print(f"  {total_i:>6}  {status}  {mid[:80]}")
+                else:
+                    batch.append(UpdateOne({"_id": doc["_id"]}, {"$set": doc}, upsert=True))
+                    ok_count += 1
+                    status = "OK  "
+                    subject_str = (doc.get("subject") or "")[:60]
+                    sender_str  = (doc.get("sender", {}).get("email") or "")[:40]
+                    print(f"  {total_i:>6}  {status}  {subject_str:<60}  {sender_str}")
+
+            if len(batch) >= BATCH_SIZE:
+                flush()
+
+            if total_i % 500 == 0:
+                elapsed = (datetime.now() - start).total_seconds()
+                rate    = total_i / elapsed if elapsed > 0 else 0
+                print(f"  {'─'*80}")
+                print(f"  Progress: ok={ok_count}  sync={sync_count}  skip={skip_count}  err={err_count}  {rate:.1f} msg/s")
+                print(f"  {'─'*80}")
+
+        flush()
+        print(f"  → {folder_count} messages from folder {folder['path']}")
+
+        if args.limit and total_i >= args.limit:
+            break
+
+    elapsed_total = (datetime.now() - start).total_seconds()
+    print(f"\n{'='*52}")
+    print(f"Result:  ok={ok_count}  |  sync={sync_count}  |  skip={skip_count}  |  err={err_count}")
+    print(f"Total time: {int(elapsed_total//3600)}h {int((elapsed_total%3600)//60)}m {int(elapsed_total%60)}s")
+    print(f"Documents in collection: {col.count_documents({})}")
+
+    if not args.no_indexes:
+        print()
+        create_indexes(col)
+
+    print(f"\nEnd: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
+    if err_count:
+        print(f"Errors logged to: {LOG_FILE}")
+
+    client.close()
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,610 @@
+"""
+parse_emails_graph_v1.2.py
+Name:    parse_emails_graph_v1.2.py
+Version: 1.2
+Date:    2026-06-02
+Author:  vladimir.buzalka
+
+Description:
+    Reads all emails from the mailbox ordinace@buzalkova.cz directly via the
+    Microsoft Graph API and imports them as documents into MongoDB.
+    Extracts every available property from each message:
+
+        - subject, sender, recipients (To/CC/BCC with types)
+        - received, sent, created, and modified times (UTC)
+        - HTML body (max 2 MB) + text preview
+        - attachments (metadata: name, size, MIME type, inline flag, graph_att_id)
+        - internet headers (SPF, DKIM, Received, X-*, ...)
+        - MAPI equivalents: importance, flag, conversation thread,
+          categories, In-Reply-To, References, ...
+        - additionally: isRead, isDraft, folder_path, inferenceClassification
+
+    Walks ALL mailbox folders recursively (Inbox, Sent, Deleted,
+    archive folders, ...).
+
+    DB:          emaily
+    Collection:  ordinace@buzalkova.cz
+    _id:         Internet Message-ID (or "graphid:<id>" as a fallback)
+
+    NOTE: The script only READS from the mailbox; it never writes to it!
+
+Usage:
+    # First import (everything):
+    python parse_emails_graph_v1.2.py
+
+    # Test on the first 50 messages:
+    python parse_emails_graph_v1.2.py --limit 50 --no-indexes
+
+    # Only folders whose path contains "Inbox" (case-insensitive):
+    python parse_emails_graph_v1.2.py --folder Inbox
+
+    # Resume after an interruption (new messages only):
+    python parse_emails_graph_v1.2.py --mode new-only
+
+    # Regular sync (updates is_read, flag, folder; imports new messages):
+    python parse_emails_graph_v1.2.py --mode sync
+
+    # Full reimport of all data:
+    python parse_emails_graph_v1.2.py --mode full
+
+Modes (--mode):
+    full      Full upsert of all fields for every message (default)
+    new-only  Skips messages already in MongoDB, imports only new ones
+    sync      Existing: updates only is_read/flag_status/categories/
+              modified_at/folder_path. New messages are imported in full.
+              Ideal for regular runs.
+
+Dependencies:
+    msal, requests, pymongo, python-dateutil
+    Python 3.10+
+
+Document structure in MongoDB:
+    _id                     Internet Message-ID (or graphid: fallback)
+    graph_id                Graph API message ID
+    subject                 message subject
+    normalized_subject      subject without RE:/FW:/AW: prefixes
+    importance              0=low 1=normal 2=high
+    flag_status             0=not flagged 1=flagged 2=complete
+    is_read                 bool, current read state in the mailbox
+    is_draft                bool
+    has_attachments         bool
+    attachment_count        int
+    inference_classification focused / other
+    categories              [str]
+    conversation_id         Graph conversationId
+    conversation_index      base64 conversationIndex
+    conversation_topic      thread topic (from the Thread-Topic internet header)
+    in_reply_to             Message-ID of the previous message
+    internet_references     [Message-ID]
+    received_at             datetime UTC
+    sent_at                 datetime UTC
+    created_at              datetime UTC
+    modified_at             datetime UTC
+    folder_id               Graph parentFolderId
+    folder_path             full folder path (e.g. Inbox/Subfolder)
+    sender.email            sender's email address
+    sender.name             display name
+    to                      To string (joined)
+    cc                      CC string
+    bcc                     BCC string
+    recipients              [{type, email, name}]
+    body_html               HTML body (max 2 MB)
+    body_preview            text preview (max 255 chars; up to 2000 for plain-text bodies)
+    attachments             [{filename, size_bytes, mime_type, is_inline, graph_att_id}]
+    headers                 dict of internet headers
+    parsed_at               datetime UTC
+
+Indexes:
+    received_at, sent_at, sender.email, graph_id (unique),
+    conversation_id, folder_path, has_attachments, categories,
+    importance, flag_status, is_read,
+    text_search (subject + body_preview + to + cc)
+
+Version history:
+    1.0  2026-06-02  Initial version
+    1.1  2026-06-02  Added --mode full/new-only/sync;
+                     removed --skip-existing (replaced by --mode new-only)
+    1.2  2026-06-02  $expand attachments with $select (no contentBytes, so faster);
+                     attachments store graph_att_id for direct download without name matching
+"""
+
+import os
+import sys
+import re
+import logging
+import argparse
+import base64
+from pathlib import Path
+from datetime import datetime, timezone
+from typing import Optional
+
+import msal
+import requests
+from dateutil import parser as dtparser
+from pymongo import MongoClient, UpdateOne, ASCENDING, TEXT
+
+if hasattr(sys.stdout, "reconfigure"):
+    sys.stdout.reconfigure(encoding="utf-8", errors="replace")
+
+# ─── CONFIGURATION ────────────────────────────────────────────────────────────
+GRAPH_TENANT_ID     = "7d269944-37a4-43a1-8140-c7517dc426e9"
+GRAPH_CLIENT_ID     = "4b222bfd-78c9-4239-a53f-43006b3ed07f"
+# Secret is read from the environment (variable name is an assumption), never committed to the repo.
+GRAPH_CLIENT_SECRET = os.environ["GRAPH_CLIENT_SECRET"]
+GRAPH_MAILBOX       = "ordinace@buzalkova.cz"
+GRAPH_URL           = "https://graph.microsoft.com/v1.0"
+
+MONGO_URI      = "mongodb://192.168.1.76:27017"
+MONGO_DB       = "emaily"
+MONGO_COL      = "ordinace@buzalkova.cz"
+BATCH_SIZE     = 100
+PAGE_SIZE      = 50
+LOG_FILE       = Path(__file__).parent / "parse_emails_errors.log"
+SCRIPT_VERSION = "1.2"
+# ──────────────────────────────────────────────────────────────────────────────
+
+logging.basicConfig(
+    filename=str(LOG_FILE),
+    level=logging.ERROR,
+    format="%(asctime)s | %(message)s",
+    datefmt="%Y-%m-%d %H:%M:%S",
+    encoding="utf-8",
+)
+
+IMPORTANCE_MAP  = {"low": 0, "normal": 1, "high": 2}
+FLAG_STATUS_MAP = {"notFlagged": 0, "flagged": 1, "complete": 2}
+RE_SUBJECT      = re.compile(r"^(RE|FW|AW|SV|VS|TR|WG|odpov[eě]d[ťt]|fwd?)[:\s]+", re.IGNORECASE)
+
+# $expand attachments without contentBytes: only the metadata we need
+ATT_EXPAND = "attachments($select=id,name,contentType,size,isInline)"
+
+MSG_SELECT = (
+    "id,internetMessageId,subject,bodyPreview,body,"
+    "importance,isRead,isDraft,hasAttachments,"
+    "receivedDateTime,sentDateTime,createdDateTime,lastModifiedDateTime,"
+    "sender,from,toRecipients,ccRecipients,bccRecipients,replyTo,"
+    "conversationId,conversationIndex,parentFolderId,"
+    "categories,flag,inferenceClassification,internetMessageHeaders"
+)
+
+# Sync mode needs only the mutable fields, which makes the fetch faster
+MSG_SELECT_SYNC = (
+    "id,internetMessageId,isRead,isDraft,flag,categories,"
+    "lastModifiedDateTime,parentFolderId,importance"
+)
+
+
+# ─── Graph API helpers ────────────────────────────────────────────────────────
+
+_graph_token: Optional[str] = None
+
+
+def get_token() -> str:
+    global _graph_token
+    app = msal.ConfidentialClientApplication(
+        GRAPH_CLIENT_ID,
+        authority=f"https://login.microsoftonline.com/{GRAPH_TENANT_ID}",
+        client_credential=GRAPH_CLIENT_SECRET,
+    )
+    result = app.acquire_token_for_client(scopes=["https://graph.microsoft.com/.default"])
+    if "access_token" not in result:
+        raise RuntimeError(f"Graph auth failed: {result}")
+    _graph_token = result["access_token"]
+    return _graph_token
+
+
+def graph_get(url: str, params: Optional[dict] = None) -> dict:
+    global _graph_token
+    if not _graph_token:
+        get_token()
+    for attempt in range(2):
+        r = requests.get(url, headers={"Authorization": f"Bearer {_graph_token}"}, params=params, timeout=30)
+        if r.status_code == 401:
+            get_token()
+            continue
+        r.raise_for_status()
+        return r.json()
+    raise RuntimeError(f"Graph GET failed after retry: {url}")
+
+
+def get_all_folders(parent_id: Optional[str] = None, parent_path: str = "") -> list[dict]:
+    """Recursively loads all mailbox folders. Returns [{id, path}]."""
+    if parent_id is None:
+        url = f"{GRAPH_URL}/users/{GRAPH_MAILBOX}/mailFolders"
+    else:
+        url = f"{GRAPH_URL}/users/{GRAPH_MAILBOX}/mailFolders/{parent_id}/childFolders"
+
+    folders = []
+    params = {"$top": 100, "$select": "id,displayName,childFolderCount"}
+    while url:
+        data = graph_get(url, params)
+        for f in data.get("value", []):
+            path = f"{parent_path}/{f['displayName']}".lstrip("/")
+            folders.append({"id": f["id"], "path": path})
+            if f.get("childFolderCount", 0) > 0:
+                folders.extend(get_all_folders(f["id"], path))
+        url = data.get("@odata.nextLink")
+        params = None
+    return folders
+
+
+def iter_folder_messages(folder_id: str, select: str = MSG_SELECT, expand_attachments: bool = True):
+    """Generator: yields messages from a folder, page by page."""
+    url = f"{GRAPH_URL}/users/{GRAPH_MAILBOX}/mailFolders/{folder_id}/messages"
+    params = {"$top": PAGE_SIZE, "$select": select}
+    if expand_attachments:
+        params["$expand"] = ATT_EXPAND
+    while url:
+        data = graph_get(url, params)
+        for msg in data.get("value", []):
+            yield msg
+        url = data.get("@odata.nextLink")
+        params = None
+
+
+# ─── Helper functions ─────────────────────────────────────────────────────────
+
+def parse_date(raw) -> Optional[datetime]:
+    if raw is None:
+        return None
+    if isinstance(raw, datetime):
+        if raw.tzinfo:
+            return raw.astimezone(timezone.utc).replace(tzinfo=None)
+        return raw
+    try:
+        dt = dtparser.parse(str(raw))
+        if dt.tzinfo:
+            return dt.astimezone(timezone.utc).replace(tzinfo=None)
+        return dt
+    except Exception:
+        return None
+
+
+def normalize_subject(subject: str) -> str:
+    s = subject.strip()
+    while True:
+        m = RE_SUBJECT.match(s)
+        if not m:
+            break
+        s = s[m.end():].strip()
+    return s
+
+
+def parse_headers(raw_headers: list) -> dict:
+    result = {}
+    for h in raw_headers:
+        k = h["name"].lower().replace("-", "_")
+        v = h["value"]
+        if k in result:
+            existing = result[k]
+            result[k] = existing + [v] if isinstance(existing, list) else [existing, v]
+        else:
+            result[k] = v
+    return result
+
+
+def format_recipients(lst: list) -> str:
+    return "; ".join(
+        f'{r["emailAddress"].get("name", "")} <{r["emailAddress"].get("address", "")}>'.strip()
+        for r in lst
+    )
+
+
+# ─── Message extraction ───────────────────────────────────────────────────────
+
+def extract_message(msg: dict, folder_path: str) -> Optional[dict]:
+    """Full extraction; used for full mode and for new messages in sync/new-only."""
+    try:
+        mid = (msg.get("internetMessageId") or "").strip() or f"graphid:{msg['id']}"
+        subject = msg.get("subject") or ""
+
+        body_html = None
+        body_preview = msg.get("bodyPreview") or ""
+        body = msg.get("body", {})
+        if body.get("contentType") == "html":
+            content = body.get("content") or ""
+            body_html = content if len(content) <= 2 * 1024 * 1024 else content[:2 * 1024 * 1024]
+        elif body.get("contentType") == "text":
+            body_preview = (body.get("content") or "")[:2000]
+
+        sender_ea    = (msg.get("from") or msg.get("sender") or {}).get("emailAddress", {})
+        to_list      = msg.get("toRecipients", [])
+        cc_list      = msg.get("ccRecipients", [])
+        bcc_list     = msg.get("bccRecipients", [])
+
+        recipients = (
+            [{"type": "to",  "email": r["emailAddress"].get("address",""), "name": r["emailAddress"].get("name","")} for r in to_list] +
+            [{"type": "cc",  "email": r["emailAddress"].get("address",""), "name": r["emailAddress"].get("name","")} for r in cc_list] +
+            [{"type": "bcc", "email": r["emailAddress"].get("address",""), "name": r["emailAddress"].get("name","")} for r in bcc_list]
+        )
+
+        importance  = IMPORTANCE_MAP.get(msg.get("importance", "normal"), 1)
+        flag_status = FLAG_STATUS_MAP.get((msg.get("flag") or {}).get("flagStatus", "notFlagged"), 0)
+
+        raw_headers   = msg.get("internetMessageHeaders") or []
+        headers       = parse_headers(raw_headers)
+
+        in_reply_to = headers.get("in_reply_to", "")
+        if isinstance(in_reply_to, list):
+            in_reply_to = in_reply_to[0]
+
+        refs_raw = headers.get("references", "")
+        if isinstance(refs_raw, list):
+            refs_raw = " ".join(refs_raw)
+        internet_refs = [r.strip() for r in refs_raw.split() if r.strip()] if refs_raw else []
+
+        conv_topic = headers.get("thread_topic", "")
+        if isinstance(conv_topic, list):
+            conv_topic = conv_topic[0]
+
+        conv_index = ""
+        ci_raw = msg.get("conversationIndex")
+        if ci_raw:
+            try:
+                conv_index = base64.b64encode(base64.b64decode(ci_raw)).decode()
+            except Exception:
+                conv_index = ci_raw
+
+        attachments = []
+        for att in msg.get("attachments") or []:
+            fname = att.get("name") or ""
+            if not fname:
+                continue
+            attachments.append({
+                "filename":     fname,
+                "size_bytes":   att.get("size", 0),
+                "mime_type":    att.get("contentType", "application/octet-stream"),
+                "is_inline":    att.get("isInline", False),
+                "graph_att_id": att.get("id"),
+            })
+
+        return {
+            "_id":      mid,
+            "graph_id": msg["id"],
+
+            "subject":            subject,
+            "normalized_subject": normalize_subject(subject),
+            "importance":         importance,
+            "flag_status":        flag_status,
+            "is_read":            msg.get("isRead", False),
+            "is_draft":           msg.get("isDraft", False),
+            "has_attachments":    msg.get("hasAttachments", False),
+            "attachment_count":   len(attachments),
+            "inference_classification": msg.get("inferenceClassification", ""),
+            "categories":         msg.get("categories") or [],
+
+            "conversation_id":     msg.get("conversationId", ""),
+            "conversation_index":  conv_index,
+            "conversation_topic":  conv_topic,
+            "in_reply_to":         in_reply_to,
+            "internet_references": internet_refs,
+
+            "received_at": parse_date(msg.get("receivedDateTime")),
+            "sent_at":     parse_date(msg.get("sentDateTime")),
+            "created_at":  parse_date(msg.get("createdDateTime")),
+            "modified_at": parse_date(msg.get("lastModifiedDateTime")),
+
+            "folder_id":   msg.get("parentFolderId", ""),
+            "folder_path": folder_path,
+
+            "sender": {
+                "email": sender_ea.get("address", ""),
+                "name":  sender_ea.get("name", ""),
+            },
+            "to":         format_recipients(to_list),
+            "cc":         format_recipients(cc_list),
+            "bcc":        format_recipients(bcc_list),
+            "recipients": recipients,
+
+            "body_html":    body_html,
+            "body_preview": body_preview,
+
+            "attachments": attachments,
+            "headers":     headers,
+
+            "parsed_at": datetime.now(timezone.utc).replace(tzinfo=None),
+        }
+
+    except Exception as e:
+        logging.error("extract_message failed [%s]: %s", msg.get("id", "?"), e)
+        return None
+
+
+def extract_sync_fields(msg: dict, folder_path: str) -> dict:
+    """Mutable fields only; used in sync mode for existing messages."""
+    return {
+        "is_read":    msg.get("isRead", False),
+        "is_draft":   msg.get("isDraft", False),
+        "flag_status": FLAG_STATUS_MAP.get((msg.get("flag") or {}).get("flagStatus", "notFlagged"), 0),
+        "importance":  IMPORTANCE_MAP.get(msg.get("importance", "normal"), 1),
+        "categories":  msg.get("categories") or [],
+        "modified_at": parse_date(msg.get("lastModifiedDateTime")),
+        "folder_id":   msg.get("parentFolderId", ""),
+        "folder_path": folder_path,
+        "parsed_at":   datetime.now(timezone.utc).replace(tzinfo=None),
+    }
+
+
+# ─── MongoDB indexes ──────────────────────────────────────────────────────────
+
+def create_indexes(col):
+    print("  Creating indexes...")
+    col.create_index([("received_at",     ASCENDING)])
+    col.create_index([("sent_at",         ASCENDING)])
+    col.create_index([("sender.email",    ASCENDING)])
+    col.create_index([("graph_id",        ASCENDING)], unique=True, sparse=True)
+    col.create_index([("conversation_id", ASCENDING)])
+    col.create_index([("folder_path",     ASCENDING)])
+    col.create_index([("has_attachments", ASCENDING)])
+    col.create_index([("categories",      ASCENDING)])
+    col.create_index([("importance",      ASCENDING)])
+    col.create_index([("flag_status",     ASCENDING)])
+    col.create_index([("is_read",         ASCENDING)])
+    col.create_index([
+        ("subject",      TEXT),
+        ("body_preview", TEXT),
+        ("to",           TEXT),
+        ("cc",           TEXT),
+    ], name="text_search", default_language="none")
+    print("  Indexes done.")
+
+
+# ─── MAIN ─────────────────────────────────────────────────────────────────────
+
+def main():
+    ap = argparse.ArgumentParser(description=f"parse_emails_graph v{SCRIPT_VERSION}")
+    ap.add_argument("--mode", default="full", choices=["full", "new-only", "sync"],
+                    help="full=full upsert (default) | new-only=new messages only | "
+                         "sync=update only mutable fields of existing messages, import new ones in full")
+    ap.add_argument("--limit",      type=int, default=0,
+                    help="Process at most N messages (0 = all)")
+    ap.add_argument("--folder",     default="",
+                    help="Process only folders whose path contains this text, case-insensitive (e.g. Inbox)")
+    ap.add_argument("--no-indexes", action="store_true",
+                    help="Do not create indexes at the end")
+    args = ap.parse_args()
+
+    start = datetime.now()
+    print(f"=== parse_emails_graph v{SCRIPT_VERSION} ===")
+    print(f"Start:    {start.strftime('%Y-%m-%d %H:%M:%S')}")
+    print(f"Mailbox:  {GRAPH_MAILBOX}")
+    print(f"MongoDB:  {MONGO_URI} -> {MONGO_DB}.{MONGO_COL}")
+    print(f"Mode:     {args.mode}")
+
+    print("\nConnecting to Graph API...")
+    try:
+        get_token()
+        print("  Graph API OK")
+    except Exception as e:
+        print(f"  ERROR: {e}")
+        sys.exit(1)
+
+    client = MongoClient(MONGO_URI, serverSelectionTimeoutMS=5000)
+    try:
+        client.admin.command("ping")
+        print("  MongoDB OK")
+    except Exception as e:
+        print(f"  ERROR: MongoDB is not available -- {e}")
+        sys.exit(1)
+    col = client[MONGO_DB][MONGO_COL]
+
+    # Existing _id values (needed for new-only and sync)
+    existing: set = set()
+    if args.mode in ("new-only", "sync"):
+        print("  Loading existing records from MongoDB...")
+        # Stream the _id values with find(); distinct() returns a single document capped at 16 MB
+        existing = {d["_id"] for d in col.find({}, {"_id": 1})}
+        print(f"  {len(existing)} already imported")
+
+    print("\nLoading folder list...")
+    all_folders = get_all_folders()
+    if args.folder:
+        all_folders = [f for f in all_folders if args.folder.lower() in f["path"].lower()]
+    print(f"  Folders to process: {len(all_folders)}")
+    for f in all_folders:
+        print(f"    {f['path']}")
+
+    # In sync mode only the mutable fields are fetched
+    is_sync    = args.mode == "sync"
+    msg_select = MSG_SELECT_SYNC if is_sync else MSG_SELECT
+    expand_att = not is_sync
+
+    batch      = []
+    ok_count   = 0
+    sync_count = 0
+    err_count  = 0
+    skip_count = 0
+    total_i    = 0
+
+    def flush():
+        if not batch:
+            return
+        try:
+            col.bulk_write(batch, ordered=False)
+        except Exception as e:
+            logging.error("bulk_write: %s", e)
+            print(f"  ERROR bulk_write: {e}")
+        batch.clear()
+
+    print()
+    for folder in all_folders:
+        print(f"--- Folder: {folder['path']} ---")
+        folder_count = 0
+
+        for msg in iter_folder_messages(folder["id"], select=msg_select, expand_attachments=expand_att):
+            if args.limit and total_i >= args.limit:
+                break
+
+            mid = (msg.get("internetMessageId") or "").strip() or f"graphid:{msg['id']}"
+            total_i += 1
+            folder_count += 1
+
+            if args.mode == "new-only" and mid in existing:
+                skip_count += 1
+                continue
+
+            if is_sync and mid in existing:
+                # Sync of an existing message: mutable fields only
+                fields = extract_sync_fields(msg, folder["path"])
+                batch.append(UpdateOne({"_id": mid}, {"$set": fields}))
+                sync_count += 1
+                status = "SYN "
+                print(f"  {total_i:>6}  {status}  {mid[:80]}")
+            else:
+                # Full extract (new messages in new-only and sync, everything in full)
+                # New messages found in sync mode need a full fetch
+                if is_sync:
+                    full_url = f"{GRAPH_URL}/users/{GRAPH_MAILBOX}/messages/{msg['id']}"
+                    full_params = {"$select": MSG_SELECT, "$expand": ATT_EXPAND}
+                    try:
+                        msg = graph_get(full_url, full_params)
+                    except Exception as e:
+                        logging.error("full fetch failed [%s]: %s", msg.get("id","?"), e)
+                        err_count += 1
+                        continue
+
+                doc = extract_message(msg, folder["path"])
+                if doc is None:
+                    err_count += 1
+                    status = "ERR "
+                    print(f"  {total_i:>6}  {status}  {mid[:80]}")
+                else:
+                    batch.append(UpdateOne({"_id": doc["_id"]}, {"$set": doc}, upsert=True))
+                    ok_count += 1
+                    status = "OK  "
+                    subject_str = (doc.get("subject") or "")[:60]
+                    sender_str  = (doc.get("sender", {}).get("email") or "")[:40]
+                    print(f"  {total_i:>6}  {status}  {subject_str:<60}  {sender_str}")
+
+            if len(batch) >= BATCH_SIZE:
+                flush()
+
+            if total_i % 500 == 0:
+                elapsed = (datetime.now() - start).total_seconds()
+                rate    = total_i / elapsed if elapsed > 0 else 0
+                print(f"  {'─'*80}")
+                print(f"  Progress: ok={ok_count}  sync={sync_count}  skip={skip_count}  err={err_count}  {rate:.1f} msg/s")
+                print(f"  {'─'*80}")
+
+        flush()
+        print(f"  → {folder_count} messages from folder {folder['path']}")
+
+        if args.limit and total_i >= args.limit:
+            break
+
+    elapsed_total = (datetime.now() - start).total_seconds()
+    print(f"\n{'='*52}")
+    print(f"Result:  ok={ok_count}  |  sync={sync_count}  |  skip={skip_count}  |  err={err_count}")
+    print(f"Total time: {int(elapsed_total//3600)}h {int((elapsed_total%3600)//60)}m {int(elapsed_total%60)}s")
+    print(f"Documents in collection: {col.count_documents({})}")
+
+    if not args.no_indexes:
+        print()
+        create_indexes(col)
+
+    print(f"\nEnd: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
+    if err_count:
+        print(f"Errors logged to: {LOG_FILE}")
+
+    client.close()
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,449 @@
+"""
+download_attachments_v1.0.py
+Name:    download_attachments_v1.0.py
+Version: 1.0
+Date:    2026-06-02
+Author:  vladimir.buzalka
+
+Description:
+    Downloads real attachments (is_inline=False) of all emails in the MongoDB
+    collection ordinace@buzalkova.cz directly via the Microsoft Graph API and
+    saves them to the directory /mnt/Emails/ordinace@buzalkova.cz/Attachments/.
+
+    Deduplication by SHA256 hash of the content:
+        - same hash = file already exists -> skipped
+        - first occurrence of a file: saved under its original name
+        - name collision (same name, different hash): faktura_2.pdf, faktura_3.pdf ...
+
+    After saving, MongoDB is updated:
+        - in the email document: each attachment gets file_hash + local_path
+        - collection emaily.attachments_index: _id=hash, filename, local_path, size_bytes,
+          mime_type, first_seen_at, ref_count (number of emails containing it)
+
+    Safe to interrupt and re-run:
+        - emails whose attachments are all downloaded (have file_hash) are skipped
+        - --force-recheck re-verifies already downloaded ones too (in case of changes on disk)
+
+    NOTE: The script only READS from the mailbox; it never writes to it!
+
+Usage:
+    python download_attachments_v1.0.py               # download everything missing
+    python download_attachments_v1.0.py --limit 50    # test on the first 50 emails
+    python download_attachments_v1.0.py --force-recheck  # re-verify downloaded ones too
+
+Docker (after adding the mount /mnt/user/Emails -> /mnt/Emails):
+    docker exec -it python-runner python /scripts/download_attachments_v1.0.py
+
+Dependencies:
+    msal, requests, pymongo
+    Python 3.10+
+
+Layout on disk:
+    /mnt/Emails/
+    └── ordinace@buzalkova.cz/
+        └── Attachments/
+            ├── faktura_2026.pdf
+            ├── vysledky_lab.pdf
+            ├── vysledky_lab_2.pdf   <- name collision, different content
+            └── ...
+
+Collection emaily.attachments_index:
+    _id          SHA256 hash (hex)
+    filename     file name on disk (first occurrence)
+    local_path   path relative to Attachments/ (currently = filename)
+    size_bytes   file size
+    mime_type    MIME type
+    first_seen_at  datetime UTC
+    ref_count    number of emails in which this attachment occurs
+
+Updates in the email document (collection ordinace@buzalkova.cz):
+    attachments[i].file_hash    SHA256 hash
+    attachments[i].local_path   path relative to Attachments/
+
+Version history:
+    1.0  2026-06-02  Initial version
+"""
+
+import os
+import sys
+import hashlib
+import logging
+import argparse
+from pathlib import Path
+from datetime import datetime, timezone
+from typing import Optional
+
+import msal
+import requests
+from pymongo import MongoClient, UpdateOne
+
+if hasattr(sys.stdout, "reconfigure"):
+    sys.stdout.reconfigure(encoding="utf-8", errors="replace")
+
+# ─── CONFIGURATION ────────────────────────────────────────────────────────────
+GRAPH_TENANT_ID     = "7d269944-37a4-43a1-8140-c7517dc426e9"
+GRAPH_CLIENT_ID     = "4b222bfd-78c9-4239-a53f-43006b3ed07f"
+# The client secret is read from the environment (variable name chosen here)
+# so that it is not committed to the repository
+GRAPH_CLIENT_SECRET = os.environ["GRAPH_CLIENT_SECRET"]
+GRAPH_MAILBOX       = "ordinace@buzalkova.cz"
+GRAPH_URL           = "https://graph.microsoft.com/v1.0"
+
+MONGO_URI           = "mongodb://192.168.1.76:27017"
+MONGO_DB            = "emaily"
+MONGO_COL_EMAILS    = "ordinace@buzalkova.cz"
+MONGO_COL_INDEX     = "attachments_index"
+
+ATTACHMENTS_DIR     = Path("/mnt/Emails/ordinace@buzalkova.cz/Attachments")
+LOG_FILE            = Path(__file__).parent / "parse_emails_errors.log"
+SCRIPT_VERSION      = "1.0"
+BATCH_SIZE          = 50
+# ──────────────────────────────────────────────────────────────────────────────
+
+logging.basicConfig(
+    filename=str(LOG_FILE),
+    level=logging.ERROR,
+    format="%(asctime)s | %(message)s",
+    datefmt="%Y-%m-%d %H:%M:%S",
+    encoding="utf-8",
+)
+
+_graph_token: Optional[str] = None
+
+
+# ─── Graph API ────────────────────────────────────────────────────────────────
+
+def get_token() -> str:
+    global _graph_token
+    app = msal.ConfidentialClientApplication(
+        GRAPH_CLIENT_ID,
+        authority=f"https://login.microsoftonline.com/{GRAPH_TENANT_ID}",
+        client_credential=GRAPH_CLIENT_SECRET,
+    )
+    result = app.acquire_token_for_client(scopes=["https://graph.microsoft.com/.default"])
+    if "access_token" not in result:
+        raise RuntimeError(f"Graph auth failed: {result}")
+    _graph_token = result["access_token"]
+    return _graph_token
+
+
+def graph_get_bytes(url: str) -> bytes:
+    """Download the binary content of an attachment."""
+    global _graph_token
+    if not _graph_token:
+        get_token()
+    for attempt in range(2):
+        r = requests.get(url, headers={"Authorization": f"Bearer {_graph_token}"}, timeout=120)
+        if r.status_code == 401:
+            get_token()
+            continue
+        r.raise_for_status()
+        return r.content
+    raise RuntimeError(f"Graph GET bytes failed: {url}")
+
+
+def graph_get_json(url: str, params: Optional[dict] = None) -> dict:
+    global _graph_token
+    if not _graph_token:
+        get_token()
+    for attempt in range(2):
+        r = requests.get(url, headers={"Authorization": f"Bearer {_graph_token}"}, params=params, timeout=30)
+        if r.status_code == 401:
+            get_token()
+            continue
+        r.raise_for_status()
+        return r.json()
+    raise RuntimeError(f"Graph GET json failed: {url}")
+
+
+def fetch_attachment_content(graph_message_id: str, attachment_id: str) -> Optional[bytes]:
+    """Download attachment content via the Graph API."""
+    url = f"{GRAPH_URL}/users/{GRAPH_MAILBOX}/messages/{graph_message_id}/attachments/{attachment_id}/$value"
+    try:
+        return graph_get_bytes(url)
+    except Exception as e:
+        logging.error("fetch_attachment_content failed [msg=%s att=%s]: %s", graph_message_id, attachment_id, e)
+        return None
+
+
+def fetch_message_attachments(graph_message_id: str) -> list[dict]:
+    """Load the message's attachment list from the Graph API (metadata including attachment ID)."""
+    url = f"{GRAPH_URL}/users/{GRAPH_MAILBOX}/messages/{graph_message_id}/attachments"
+    params = {"$select": "id,name,contentType,size,isInline,contentId"}
+    items = []
+    try:
+        # Follow @odata.nextLink so attachments beyond the first page are not missed
+        while url:
+            data = graph_get_json(url, params)
+            items.extend(data.get("value", []))
+            url = data.get("@odata.nextLink")
+            params = None
+        return items
+    except Exception as e:
+        logging.error("fetch_message_attachments failed [%s]: %s", graph_message_id, e)
+        return []
+
+
+# ─── Dedup + saving ───────────────────────────────────────────────────────────
+
+def sha256(data: bytes) -> str:
+    return hashlib.sha256(data).hexdigest()
+
+
+def resolve_filename(desired_name: str, att_dir: Path, hash_val: str, index_col) -> str:
+    """
+    Return the file name to use when saving.
+    If desired_name is already taken by different content (in the index or on
+    disk), append a suffix _2, _3 ...
+    """
+    def is_usable(name: str) -> bool:
+        # A name recorded in the index is usable only if it belongs to the same hash
+        entry = index_col.find_one({"filename": name})
+        if entry:
+            return entry["_id"] == hash_val
+        # Not indexed: do not overwrite an unindexed file on disk that has different content
+        path = att_dir / name
+        return not path.exists() or sha256(path.read_bytes()) == hash_val
+
+    if is_usable(desired_name):
+        return desired_name
+    # Name taken by different content: look for a free suffix
+    stem   = Path(desired_name).stem
+    suffix = Path(desired_name).suffix
+    n = 2
+    while True:
+        candidate = f"{stem}_{n}{suffix}"
+        if is_usable(candidate):
+            return candidate
+        n += 1
+
+
+def save_attachment(content: bytes, original_name: str, att_dir: Path, index_col) -> tuple[str, str, bool]:
+    """
+    Save an attachment with deduplication.
+    Returns (hash, local_path, was_new):
+        was_new=True  -> the file was saved
+        was_new=False -> the hash already existed, the file was skipped
+    """
+    hash_val = sha256(content)
+
+    # Check the index; if the hash already exists, return the existing record
+    existing = index_col.find_one({"_id": hash_val})
+    if existing:
+        # Restore the file if it has disappeared from disk since it was indexed
+        existing_path = att_dir / existing["local_path"]
+        if not existing_path.exists():
+            existing_path.write_bytes(content)
+        # Increment the reference counter
+        index_col.update_one({"_id": hash_val}, {"$inc": {"ref_count": 1}})
+        return hash_val, existing["local_path"], False
+
+    # New file: determine the name
+    safe_name = "".join(c if c.isalnum() or c in "._- " else "_" for c in original_name).strip()
+    if safe_name in ("", ".", ".."):
+        safe_name = f"attachment_{hash_val[:8]}"
+
+    filename  = resolve_filename(safe_name, att_dir, hash_val, index_col)
+    file_path = att_dir / filename
+
+    # Save the file
+    file_path.write_bytes(content)
+
+    # Record in the index
+    index_col.insert_one({
+        "_id":          hash_val,
+        "filename":     filename,
+        "local_path":   filename,
+        "size_bytes":   len(content),
+        "mime_type":    "",
+        "first_seen_at": datetime.now(timezone.utc).replace(tzinfo=None),
+        "ref_count":    1,
+    })
+
+    return hash_val, filename, True
+
+
+# ─── MAIN ─────────────────────────────────────────────────────────────────────
+
+def main():
+    ap = argparse.ArgumentParser(description=f"download_attachments v{SCRIPT_VERSION}")
+    ap.add_argument("--limit",         type=int, default=0,
+                    help="Process at most N emails (0 = all)")
+    ap.add_argument("--force-recheck", action="store_true",
+                    help="Re-verify even emails whose attachments already have file_hash")
+    ap.add_argument("--no-indexes",    action="store_true",
+                    help="Do not create indexes on the attachments_index collection")
+    args = ap.parse_args()
+
+    start = datetime.now()
+    print(f"=== download_attachments v{SCRIPT_VERSION} ===")
+    print(f"Start:    {start.strftime('%Y-%m-%d %H:%M:%S')}")
+    print(f"Mailbox:  {GRAPH_MAILBOX}")
+    print(f"Target directory: {ATTACHMENTS_DIR}")
+    print(f"MongoDB:  {MONGO_URI} -> {MONGO_DB}")
+
+    # Directory
+    ATTACHMENTS_DIR.mkdir(parents=True, exist_ok=True)
+    print("  Directory OK")
+
+    # Graph
+    print("\nConnecting to Graph API...")
+    try:
+        get_token()
+        print("  Graph API OK")
+    except Exception as e:
+        print(f"  ERROR: {e}")
+        sys.exit(1)
+
+    # MongoDB
+    client = MongoClient(MONGO_URI, serverSelectionTimeoutMS=5000)
+    try:
+        client.admin.command("ping")
+        print("  MongoDB OK")
+    except Exception as e:
+        print(f"  ERROR: MongoDB is not available -- {e}")
+        sys.exit(1)
+
+    col_emails = client[MONGO_DB][MONGO_COL_EMAILS]
+    col_index  = client[MONGO_DB][MONGO_COL_INDEX]
+
+    # Indexes on the attachment index collection
+    if not args.no_indexes:
+        col_index.create_index("filename")
+        col_index.create_index("mime_type")
+
+    # Query: emails with attachments that have not been processed yet
+    if args.force_recheck:
+        query = {"has_attachments": True}
+    else:
+        query = {
+            "has_attachments": True,
+            "attachments": {
+                "$elemMatch": {
+                    "is_inline": False,
+                    "file_hash":  {"$exists": False},
+                }
+            }
+        }
+
+    total = col_emails.count_documents(query)
+    if args.limit:
+        # Keep the i/total progress display consistent with --limit
+        total = min(total, args.limit)
+    print(f"\nEmails to process: {total}")
+    if total == 0:
+        print("Nothing to download.")
+        client.close()
+        return
+
+    cursor = col_emails.find(query, {"_id": 1, "graph_id": 1, "subject": 1, "attachments": 1})
+    if args.limit:
+        cursor = cursor.limit(args.limit)
+
+    ok_count   = 0
+    new_count  = 0
+    skip_count = 0
+    err_count  = 0
+    email_i    = 0
+    batch      = []
+
+    def flush():
+        if not batch:
+            return
+        try:
+            col_emails.bulk_write(batch, ordered=False)
+        except Exception as e:
+            logging.error("bulk_write: %s", e)
+            print(f"  ERROR bulk_write: {e}")
+        batch.clear()
+
+    for email_doc in cursor:
+        email_i += 1
+        email_id   = email_doc["_id"]
+        graph_id   = email_doc.get("graph_id", "")
+        subject    = (email_doc.get("subject") or "")[:60]
+        att_list   = email_doc.get("attachments") or []
+
+        # Real (non-inline) attachments only
+        real_atts = [a for a in att_list if not a.get("is_inline", False)]
+        if not real_atts:
+            continue
+
+        print(f"\n  {email_i:>5}/{total}  {subject}")
+
+        # Load attachment metadata from the Graph API
+        graph_atts      = [a for a in fetch_message_attachments(graph_id) if not a.get("isInline", False)]
+        graph_att_by_id = {a["id"]: a for a in graph_atts if a.get("id")}
+        graph_att_map   = {(a.get("name") or ""): a for a in graph_atts}
+
+        updated_atts = list(att_list)
+        email_ok = True
+
+        for i, att in enumerate(updated_atts):
+            if att.get("is_inline", False):
+                continue
+            if not args.force_recheck and att.get("file_hash"):
+                skip_count += 1
+                print(f"         SKIP  {att.get('filename', '')}")
+                continue
+
+            att_name  = att.get("filename", "")
+            # Prefer the Graph attachment ID stored in the email document (graph_att_id);
+            # names are not unique within one email
+            graph_att = graph_att_by_id.get(att.get("graph_att_id")) or graph_att_map.get(att_name)
+
+            if not graph_att:
+                # Fall back to a partial, case-insensitive name match
+                for gname, ga in graph_att_map.items():
+                    if att_name and att_name.lower() in gname.lower():
+                        graph_att = ga
+                        break
+
+            if not graph_att:
+                logging.error("attachment not found in Graph [email=%s att=%s]", email_id, att_name)
+                print(f"         ERR   {att_name} (not found in Graph)")
+                err_count += 1
+                email_ok = False
+                continue
+
+            # Download the content
+            content = fetch_attachment_content(graph_id, graph_att["id"])
+            if content is None:
+                err_count += 1
+                email_ok = False
+                print(f"         ERR   {att_name} (download failed)")
+                continue
+
+            # Save with deduplication
+            hash_val, local_path, was_new = save_attachment(content, att_name, ATTACHMENTS_DIR, col_index)
+
+            # Update the MIME type in the index
+            col_index.update_one(
+                {"_id": hash_val},
+                {"$set": {"mime_type": att.get("mime_type", graph_att.get("contentType", ""))}},
+            )
+
+            # Record the result in the email
+            updated_atts[i] = {**att, "file_hash": hash_val, "local_path": local_path}
+
+            if was_new:
+                new_count += 1
+                print(f"         NEW   {local_path}  ({len(content):,} B)")
+            else:
+                skip_count += 1
+                print(f"         DUP   {att_name} -> {local_path}")
+
+        if email_ok:
+            ok_count += 1
+
+        # Write the updated attachments back to the email
+        batch.append(UpdateOne(
+            {"_id": email_id},
+            {"$set": {"attachments": updated_atts}}
+        ))
+
+        if len(batch) >= BATCH_SIZE:
+            flush()
+
+        if email_i % 100 == 0:
+            elapsed = (datetime.now() - start).total_seconds()
+            print(f"  {'─'*60}")
+            print(f"  Progress: emails={email_i}/{total}  new={new_count}  dup={skip_count}  err={err_count}  elapsed={elapsed:.0f}s")
+            print(f"  {'─'*60}")
+
+    flush()
+
+    elapsed_total = (datetime.now() - start).total_seconds()
+    files_total   = col_index.count_documents({})
+    size_total    = sum(d.get("size_bytes", 0) for d in col_index.find({}, {"size_bytes": 1}))
+
+    print(f"\n{'='*52}")
+    print(f"Result:  emails={ok_count}  |  new files={new_count}  |  duplicates={skip_count}  |  err={err_count}")
+    print(f"Files in index: {files_total}  ({size_total/1024/1024:.1f} MB)")
+    print(f"Total time: {int(elapsed_total//3600)}h {int((elapsed_total%3600)//60)}m {int(elapsed_total%60)}s")
+    print(f"\nEnd: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
+    if err_count:
+        print(f"Errors logged to: {LOG_FILE}")
+
+    client.close()
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,428 @@
+"""
+download_attachments_v1.1.py
+Nazev:  download_attachments_v1.1.py
+Verze:  1.1
+Datum:  2026-06-02
+Autor:  vladimir.buzalka
+
+Popis:
+    Stahuje skutecne prilohy (is_inline=False) vsech emailu z MongoDB
+    pres Microsoft Graph API a uklada je do adresare
+    /mnt/Emails/<schránka>/Attachments/.
+
+    Schránka se predava jako povinny parametr --mailbox.
+
+    Deduplikace podle SHA256 hashe obsahu:
+        - stejny hash = soubor uz existuje -> preskoci
+        - prvni vyskytu souboru: ulozi pod puvodnimnazvem
+        - kolize nazvu (stejny nazev, jiny hash): faktura_2.pdf, faktura_3.pdf ...
+
+    Po ulozeni aktualizuje MongoDB:
+        - v email dokumentu: kazda priloha dostane file_hash + local_path
+        - kolekce emaily.attachments_index: _id=hash, filename, path, size_bytes,
+          mime_type, mailbox, first_seen_at, ref_count
+
+    Bezpecne prerusit a opakovat — emaily kde vsechny prilohy maji file_hash
+    se preskoci. --force-recheck znovu overi i uz stazene.
+
+    POZOR: Skript pouze CIST ze schranky — zadny zapis do schranky!
+
+Spousteni:
+    python download_attachments_v1.1.py --mailbox ordinace@buzalkova.cz
+    python download_attachments_v1.1.py --mailbox vladimir.buzalka@buzalka.cz --limit 50
+    python download_attachments_v1.1.py --mailbox ordinace@buzalkova.cz --force-recheck
+
+Docker:
+    docker exec -it python-runner python /scripts/download_attachments_v1.1.py \\
+        --mailbox ordinace@buzalkova.cz
+
+Dependencies:
+    msal, requests, pymongo
+    Python 3.10+
+
+Layout on disk:
+    /mnt/Emails/
+    └── <mailbox>/
+        └── Attachments/
+            ├── faktura_2026.pdf
+            ├── vysledky_lab.pdf
+            ├── vysledky_lab_2.pdf
+            └── ...
+
+Collection emaily.attachments_index:
+    _id            SHA-256 hash (hex)
+    filename       file name on disk
+    local_path     path relative to Attachments/
+    size_bytes     file size
+    mime_type      MIME type
+    mailbox        mailbox in which the file was first seen
+    first_seen_at  datetime UTC
+    ref_count      number of emails in which this attachment occurs
+
+Version history:
+    1.0  2026-06-02  Initial version
+    1.1  2026-06-02  Mailbox as the --mailbox parameter (general-purpose use)
+"""
+
+import sys
+import hashlib
+import logging
+import argparse
+from pathlib import Path
+from datetime import datetime, timezone
+from typing import Optional
+
+import msal
+import requests
+from pymongo import MongoClient, UpdateOne
+
+if hasattr(sys.stdout, "reconfigure"):
+    sys.stdout.reconfigure(encoding="utf-8", errors="replace")
+
+# ─── CONFIGURATION ────────────────────────────────────────────────────────────
+GRAPH_TENANT_ID     = "7d269944-37a4-43a1-8140-c7517dc426e9"
+GRAPH_CLIENT_ID     = "4b222bfd-78c9-4239-a53f-43006b3ed07f"
+GRAPH_CLIENT_SECRET = "Txg8Q~MjhocuopxsJyJBhPmDfMxZ2r5WpTFj1dfk"
+GRAPH_URL           = "https://graph.microsoft.com/v1.0"
+
+MONGO_URI           = "mongodb://192.168.1.76:27017"
+MONGO_DB            = "emaily"
+MONGO_COL_INDEX     = "attachments_index"
+
+EMAILS_BASE_DIR     = Path("/mnt/Emails")
+LOG_FILE            = Path(__file__).parent / "parse_emails_errors.log"
+SCRIPT_VERSION      = "1.1"
+BATCH_SIZE          = 50
+# ──────────────────────────────────────────────────────────────────────────────
+
+logging.basicConfig(
+    filename=str(LOG_FILE),
+    level=logging.ERROR,
+    format="%(asctime)s | %(message)s",
+    datefmt="%Y-%m-%d %H:%M:%S",
+    encoding="utf-8",
+)
+
+_graph_token: Optional[str] = None
+
+
+# ─── Graph API ────────────────────────────────────────────────────────────────
+
+def get_token() -> str:
+    global _graph_token
+    app = msal.ConfidentialClientApplication(
+        GRAPH_CLIENT_ID,
+        authority=f"https://login.microsoftonline.com/{GRAPH_TENANT_ID}",
+        client_credential=GRAPH_CLIENT_SECRET,
+    )
+    result = app.acquire_token_for_client(scopes=["https://graph.microsoft.com/.default"])
+    if "access_token" not in result:
+        raise RuntimeError(f"Graph auth failed: {result}")
+    _graph_token = result["access_token"]
+    return _graph_token
+
+
+def graph_get_bytes(url: str) -> bytes:
+    global _graph_token
+    if not _graph_token:
+        get_token()
+    for attempt in range(2):
+        r = requests.get(url, headers={"Authorization": f"Bearer {_graph_token}"}, timeout=120, stream=True)
+        if r.status_code == 401:
+            get_token()
+            continue
+        r.raise_for_status()
+        return r.content
+    raise RuntimeError(f"Graph GET bytes failed: {url}")
+
+
+def graph_get_json(url: str, params: Optional[dict] = None) -> dict:
+    global _graph_token
+    if not _graph_token:
+        get_token()
+    for attempt in range(2):
+        r = requests.get(url, headers={"Authorization": f"Bearer {_graph_token}"}, params=params, timeout=30)
+        if r.status_code == 401:
+            get_token()
+            continue
+        r.raise_for_status()
+        return r.json()
+    raise RuntimeError(f"Graph GET json failed: {url}")
+
+
+def fetch_message_attachments(mailbox: str, graph_message_id: str) -> list[dict]:
+    url = f"{GRAPH_URL}/users/{mailbox}/messages/{graph_message_id}/attachments"
+    try:
+        data = graph_get_json(url, {"$select": "id,name,contentType,size,isInline,contentId"})
+        return data.get("value", [])
+    except Exception as e:
+        logging.error("fetch_message_attachments failed [%s]: %s", graph_message_id, e)
+        return []
+
+
+def fetch_attachment_content(mailbox: str, graph_message_id: str, attachment_id: str) -> Optional[bytes]:
+    url = f"{GRAPH_URL}/users/{mailbox}/messages/{graph_message_id}/attachments/{attachment_id}/$value"
+    try:
+        return graph_get_bytes(url)
+    except Exception as e:
+        logging.error("fetch_attachment_content failed [msg=%s att=%s]: %s", graph_message_id, attachment_id, e)
+        return None
+
+
+# ─── Dedup + saving ───────────────────────────────────────────────────────────
+
+def sha256(data: bytes) -> str:
+    return hashlib.sha256(data).hexdigest()
+
+
+def safe_filename(name: str) -> str:
+    safe = "".join(c if c.isalnum() or c in "._- " else "_" for c in name).strip()
+    return safe or "attachment"
+
+
+def resolve_filename(desired_name: str, att_dir: Path, hash_val: str, col_index) -> str:
+    """Return the filename to save under; resolves collisions (same name, different hash)."""
+    existing = col_index.find_one({"filename": desired_name})
+    if existing:
+        if existing["_id"] == hash_val:
+            return desired_name  # Dedup hit: same hash
+        # Collision: look for a free numeric suffix
+        stem   = Path(desired_name).stem
+        suffix = Path(desired_name).suffix
+        n = 2
+        while True:
+            candidate = f"{stem}_{n}{suffix}"
+            ex2 = col_index.find_one({"filename": candidate})
+            if not ex2 or ex2["_id"] == hash_val:
+                if not (att_dir / candidate).exists() or (ex2 and ex2["_id"] == hash_val):
+                    return candidate
+            n += 1
+    return desired_name
+
+
+def save_attachment(
+    content: bytes,
+    original_name: str,
+    mime_type: str,
+    mailbox: str,
+    att_dir: Path,
+    col_index,
+) -> tuple[str, str, bool]:
+    """
+    Save an attachment with deduplication.
+    Returns (hash, local_path, was_new).
+    """
+    hash_val = sha256(content)
+
+    existing = col_index.find_one({"_id": hash_val})
+    if existing:
+        col_index.update_one({"_id": hash_val}, {"$inc": {"ref_count": 1}})
+        return hash_val, existing["local_path"], False
+
+    filename  = resolve_filename(safe_filename(original_name), att_dir, hash_val, col_index)
+    file_path = att_dir / filename
+    file_path.write_bytes(content)
+
+    col_index.insert_one({
+        "_id":          hash_val,
+        "filename":     filename,
+        "local_path":   filename,
+        "size_bytes":   len(content),
+        "mime_type":    mime_type,
+        "mailbox":      mailbox,
+        "first_seen_at": datetime.now(timezone.utc).replace(tzinfo=None),
+        "ref_count":    1,
+    })
+
+    return hash_val, filename, True
+
+
+# ─── MAIN ─────────────────────────────────────────────────────────────────────
+
+def main():
+    ap = argparse.ArgumentParser(description=f"download_attachments v{SCRIPT_VERSION}")
+    ap.add_argument("--mailbox",       required=True,
+                    help="Email mailbox (e.g. ordinace@buzalkova.cz)")
+    ap.add_argument("--limit",         type=int, default=0,
+                    help="Process at most N emails (0 = all)")
+    ap.add_argument("--force-recheck", action="store_true",
+                    help="Re-verify emails whose attachments already have a file_hash")
+    ap.add_argument("--no-indexes",    action="store_true",
+                    help="Do not create indexes on the attachments_index collection")
+    args = ap.parse_args()
+
+    mailbox     = args.mailbox
+    att_dir     = EMAILS_BASE_DIR / mailbox / "Attachments"
+    mongo_col   = mailbox
+
+    start = datetime.now()
+    print(f"=== download_attachments v{SCRIPT_VERSION} ===")
+    print(f"Start:    {start.strftime('%Y-%m-%d %H:%M:%S')}")
+    print(f"Mailbox:  {mailbox}")
+    print(f"Target directory: {att_dir}")
+    print(f"MongoDB:  {MONGO_URI} -> {MONGO_DB}.{mongo_col}")
+
+    att_dir.mkdir(parents=True, exist_ok=True)
+    print("  Directory OK")
+
+    print("\nConnecting to Graph API...")
+    try:
+        get_token()
+        print("  Graph API OK")
+    except Exception as e:
+        print(f"  ERROR: {e}")
+        sys.exit(1)
+
+    client = MongoClient(MONGO_URI, serverSelectionTimeoutMS=5000)
+    try:
+        client.admin.command("ping")
+        print("  MongoDB OK")
+    except Exception as e:
+        print(f"  ERROR: MongoDB is not reachable -- {e}")
+        sys.exit(1)
+
+    col_emails = client[MONGO_DB][mongo_col]
+    col_index  = client[MONGO_DB][MONGO_COL_INDEX]
+
+    if not args.no_indexes:
+        col_index.create_index("filename")
+        col_index.create_index("mime_type")
+        col_index.create_index("mailbox")
+
+    # Query
+    if args.force_recheck:
+        query = {"has_attachments": True}
+    else:
+        query = {
+            "has_attachments": True,
+            "attachments": {
+                "$elemMatch": {
+                    "is_inline": False,
+                    "file_hash": {"$exists": False},
+                }
+            }
+        }
+
+    total = col_emails.count_documents(query)
+    print(f"\nEmails to process: {total}")
+    if total == 0:
+        print("Nothing to download.")
+        client.close()
+        return
+
+    cursor = col_emails.find(query, {"_id": 1, "graph_id": 1, "subject": 1, "attachments": 1})
+    if args.limit:
+        cursor = cursor.limit(args.limit)
+
+    ok_count   = 0
+    new_count  = 0
+    dup_count  = 0
+    err_count  = 0
+    email_i    = 0
+    batch      = []
+
+    def flush():
+        if not batch:
+            return
+        try:
+            col_emails.bulk_write(batch, ordered=False)
+        except Exception as e:
+            logging.error("bulk_write: %s", e)
+            print(f"  ERROR bulk_write: {e}")
+        batch.clear()
+
+    for email_doc in cursor:
+        email_i   += 1
+        email_id   = email_doc["_id"]
+        graph_id   = email_doc.get("graph_id", "")
+        subject    = (email_doc.get("subject") or "")[:60]
+        att_list   = email_doc.get("attachments") or []
+
+        real_atts = [a for a in att_list if not a.get("is_inline", False)]
+        if not real_atts:
+            continue
+
+        print(f"\n  {email_i:>5}/{total}  {subject}")
+
+        graph_atts    = fetch_message_attachments(mailbox, graph_id)
+        graph_att_map = {a["name"]: a for a in graph_atts if not a.get("isInline", False)}
+
+        updated_atts = list(att_list)
+        email_ok     = True
+
+        for i, att in enumerate(updated_atts):
+            if att.get("is_inline", False):
+                continue
+            if not args.force_recheck and att.get("file_hash"):
+                print(f"         SKIP  {att['filename']}")
+                continue
+
+            att_name  = att.get("filename", "")
+            graph_att = graph_att_map.get(att_name)
+            if not graph_att:
+                for gname, ga in graph_att_map.items():
+                    if att_name.lower() in gname.lower():
+                        graph_att = ga
+                        break
+
+            if not graph_att:
+                logging.error("attachment not found in Graph [email=%s att=%s]", email_id, att_name)
+                print(f"         ERR   {att_name} (not found in Graph)")
+                err_count += 1
+                email_ok = False
+                continue
+
+            content = fetch_attachment_content(mailbox, graph_id, graph_att["id"])
+            if content is None:
+                err_count += 1
+                email_ok = False
+                print(f"         ERR   {att_name} (download failed)")
+                continue
+
+            mime_type = att.get("mime_type") or graph_att.get("contentType", "")
+            hash_val, local_path, was_new = save_attachment(
+                content, att_name, mime_type, mailbox, att_dir, col_index
+            )
+
+            updated_atts[i] = {**att, "file_hash": hash_val, "local_path": local_path}
+
+            if was_new:
+                new_count += 1
+                print(f"         NEW   {local_path}  ({len(content):,} B)")
+            else:
+                dup_count += 1
+                print(f"         DUP   {att_name} -> {local_path}")
+
+        if email_ok:
+            ok_count += 1
+
+        batch.append(UpdateOne({"_id": email_id}, {"$set": {"attachments": updated_atts}}))
+
+        if len(batch) >= BATCH_SIZE:
+            flush()
+
+        if email_i % 100 == 0:
+            elapsed = (datetime.now() - start).total_seconds()
+            print(f"  {'─'*60}")
+            print(f"  Progress: emails={email_i}/{total}  new={new_count}  dup={dup_count}  err={err_count}  elapsed={int(elapsed)}s")
+            print(f"  {'─'*60}")
+
+    flush()
+
+    elapsed_total = (datetime.now() - start).total_seconds()
+    files_total   = col_index.count_documents({})
+    size_total    = sum(d.get("size_bytes", 0) for d in col_index.find({}, {"size_bytes": 1}))
+
+    print(f"\n{'='*52}")
+    print(f"Result:  emails={ok_count}  |  new={new_count}  |  dup={dup_count}  |  err={err_count}")
+    print(f"Files in index: {files_total}  ({size_total / 1024 / 1024:.1f} MB)")
+    print(f"Total time: {int(elapsed_total//3600)}h {int((elapsed_total%3600)//60)}m {int(elapsed_total%60)}s")
+    print(f"\nEnd: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
+    if err_count:
+        print(f"Errors logged to: {LOG_FILE}")
+
+    client.close()
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,483 @@
+"""
+download_attachments_v1.3.py
+Name:    download_attachments_v1.3.py
+Version: 1.3
+Date:    2026-06-02
+Author:  vladimir.buzalka
+
+Description:
+    Downloads the real (is_inline=False) attachments of all emails stored in
+    MongoDB through the Microsoft Graph API and saves them to the directory
+    /mnt/Emails/<mailbox>/Attachments/.
+
+    The mailbox is passed as the required --mailbox parameter.
+
+    Deduplication by SHA-256 hash of the content:
+        - same hash = file already exists -> skipped
+        - first occurrence of a file: saved under its original name
+        - name collision (same name, different hash): faktura_2.pdf, faktura_3.pdf ...
+
+    After saving, MongoDB is updated:
+        - in the email document: each attachment gets file_hash + local_path
+        - collection emaily.attachments_index: _id=hash, filename, local_path,
+          size_bytes, mime_type, mailbox, first_seen_at, ref_count
+
+    Safe to interrupt and re-run: emails in which every attachment already has
+    a file_hash are skipped. --force-recheck re-verifies downloaded ones too.
+
+    NOTE: The script only READS from the mailbox; it never writes to it!
+
+Usage:
+    python download_attachments_v1.3.py --mailbox ordinace@buzalkova.cz
+    python download_attachments_v1.3.py --mailbox ordinace@buzalkova.cz --limit 50
+    python download_attachments_v1.3.py --mailbox ordinace@buzalkova.cz --force-recheck
+
+Docker:
+    docker exec -it python-runner python /scripts/download_attachments_v1.3.py \\
+        --mailbox ordinace@buzalkova.cz
+
+Dependencies:
+    msal, requests, pymongo
+    Python 3.10+
+
+Version history:
+    1.0  2026-06-02  Initial version
+    1.1  2026-06-02  Mailbox as the --mailbox parameter
+    1.2  2026-06-02  Fix: Graph attachment map now includes inline items; name
+                     normalization; S/MIME skipped; inline per Graph -> SKIP, not ERR
+    1.3  2026-06-02  Primary download via graph_att_id (direct ID, no name matching);
+                     fixed $select on the attachment listing (removed contentId, which
+                     caused BadRequest and returned an empty list); name matching
+                     remains as a fallback for old emails without graph_att_id
+"""
+
+import sys
+import re
+import hashlib
+import logging
+import argparse
+import unicodedata
+from pathlib import Path
+from datetime import datetime, timezone
+from typing import Optional
+
+import msal
+import requests
+from pymongo import MongoClient, UpdateOne
+
+if hasattr(sys.stdout, "reconfigure"):
+    sys.stdout.reconfigure(encoding="utf-8", errors="replace")
+
+# ─── CONFIGURATION ────────────────────────────────────────────────────────────
+GRAPH_TENANT_ID     = "7d269944-37a4-43a1-8140-c7517dc426e9"
+GRAPH_CLIENT_ID     = "4b222bfd-78c9-4239-a53f-43006b3ed07f"
+GRAPH_CLIENT_SECRET = "Txg8Q~MjhocuopxsJyJBhPmDfMxZ2r5WpTFj1dfk"
+GRAPH_URL           = "https://graph.microsoft.com/v1.0"
+
+MONGO_URI           = "mongodb://192.168.1.76:27017"
+MONGO_DB            = "emaily"
+MONGO_COL_INDEX     = "attachments_index"
+
+EMAILS_BASE_DIR     = Path("/mnt/Emails")
+LOG_FILE            = Path(__file__).parent / "parse_emails_errors.log"
+SCRIPT_VERSION      = "1.3"
+BATCH_SIZE          = 50
+
+# Attachment types to skip (S/MIME signatures, certificates)
+SKIP_EXTENSIONS = {".p7m", ".p7s", ".p7c", ".p7b"}
+# ──────────────────────────────────────────────────────────────────────────────
+
+logging.basicConfig(
+    filename=str(LOG_FILE),
+    level=logging.ERROR,
+    format="%(asctime)s | %(message)s",
+    datefmt="%Y-%m-%d %H:%M:%S",
+    encoding="utf-8",
+)
+
+_graph_token: Optional[str] = None
+
+
+# ─── Graph API ────────────────────────────────────────────────────────────────
+
+def get_token() -> str:
+    global _graph_token
+    app = msal.ConfidentialClientApplication(
+        GRAPH_CLIENT_ID,
+        authority=f"https://login.microsoftonline.com/{GRAPH_TENANT_ID}",
+        client_credential=GRAPH_CLIENT_SECRET,
+    )
+    result = app.acquire_token_for_client(scopes=["https://graph.microsoft.com/.default"])
+    if "access_token" not in result:
+        raise RuntimeError(f"Graph auth failed: {result}")
+    _graph_token = result["access_token"]
+    return _graph_token
+
+
+def graph_get_bytes(url: str) -> bytes:
+    global _graph_token
+    if not _graph_token:
+        get_token()
+    for attempt in range(2):
+        r = requests.get(url, headers={"Authorization": f"Bearer {_graph_token}"}, timeout=120, stream=True)
+        if r.status_code == 401:
+            get_token()
+            continue
+        r.raise_for_status()
+        return r.content
+    raise RuntimeError(f"Graph GET bytes failed: {url}")
+
+
+def graph_get_json(url: str, params: Optional[dict] = None) -> dict:
+    global _graph_token
+    if not _graph_token:
+        get_token()
+    for attempt in range(2):
+        r = requests.get(url, headers={"Authorization": f"Bearer {_graph_token}"}, params=params, timeout=30)
+        if r.status_code == 401:
+            get_token()
+            continue
+        r.raise_for_status()
+        return r.json()
+    raise RuntimeError(f"Graph GET json failed: {url}")
+
+
+def fetch_message_attachments(mailbox: str, graph_message_id: str) -> list[dict]:
+    """Load metadata for all attachments of a message (without contentBytes)."""
+    url = f"{GRAPH_URL}/users/{mailbox}/messages/{graph_message_id}/attachments"
+    try:
+        # Note: contentId is NOT on the base attachment type, so it must not appear in $select
+        data = graph_get_json(url, {"$select": "id,name,contentType,size,isInline"})
+        return data.get("value", [])
+    except Exception as e:
+        logging.error("fetch_message_attachments failed [%s]: %s", graph_message_id, e)
+        return []
+
+
+def fetch_attachment_content(mailbox: str, graph_message_id: str, attachment_id: str) -> Optional[bytes]:
+    url = f"{GRAPH_URL}/users/{mailbox}/messages/{graph_message_id}/attachments/{attachment_id}/$value"
+    try:
+        return graph_get_bytes(url)
+    except Exception as e:
+        logging.error("fetch_attachment_content failed [msg=%s att=%s]: %s",
+                      graph_message_id, attachment_id, e)
+        return None
+
+
+# ─── Helper functions ─────────────────────────────────────────────────────────
+
+def normalize_name(name: str) -> str:
+    """Normalize a name for comparison: lowercase, no diacritics, only alnum + ._-"""
+    nfkd = unicodedata.normalize("NFKD", name.lower().strip())
+    ascii_str = "".join(c for c in nfkd if not unicodedata.combining(c))
+    return re.sub(r"[^\w.\-]", "_", ascii_str)
+
+
+def find_graph_att(att_name: str, att_size: int, graph_atts: list[dict]) -> Optional[dict]:
+    """Fallback: find the attachment in the Graph listing by name (for emails without graph_att_id)."""
+    # 1. Exact match
+    for ga in graph_atts:
+        if ga["name"] == att_name:
+            return ga
+
+    norm_want = normalize_name(att_name)
+
+    # 2. Normalized match + size (within 10 %); checked first so step 3 cannot shadow it
+    for ga in graph_atts:
+        if normalize_name(ga["name"]) == norm_want:
+            ga_size = ga.get("size", 0)
+            if att_size == 0 or ga_size == 0 or abs(ga_size - att_size) / max(ga_size, att_size) < 0.1:
+                return ga
+
+    # 3. Normalized match regardless of size
+    for ga in graph_atts:
+        if normalize_name(ga["name"]) == norm_want:
+            return ga
+
+    # 4. Partial suffix match (last 20 characters of the normalized name)
+    for ga in graph_atts:
+        if norm_want[-20:] and normalize_name(ga["name"]).endswith(norm_want[-20:]):
+            return ga
+
+    return None
+
+
+def sha256(data: bytes) -> str:
+    return hashlib.sha256(data).hexdigest()
+
+
+def safe_filename(name: str) -> str:
+    safe = "".join(c if c.isalnum() or c in "._- ()" else "_" for c in name).strip()
+    return safe or "attachment"
+
+
+def resolve_filename(desired_name: str, att_dir: Path, hash_val: str, col_index) -> str:
+    existing = col_index.find_one({"filename": desired_name})
+    if existing:
+        if existing["_id"] == hash_val:
+            return desired_name
+        stem   = Path(desired_name).stem
+        suffix = Path(desired_name).suffix
+        n = 2
+        while True:
+            candidate = f"{stem}_{n}{suffix}"
+            ex2 = col_index.find_one({"filename": candidate})
+            if not ex2 or ex2["_id"] == hash_val:
+                if not (att_dir / candidate).exists() or (ex2 and ex2["_id"] == hash_val):
+                    return candidate
+            n += 1
+    return desired_name
+
+
+def save_attachment(
+    content: bytes,
+    original_name: str,
+    mime_type: str,
+    mailbox: str,
+    att_dir: Path,
+    col_index,
+) -> tuple[str, str, bool]:
+    hash_val = sha256(content)
+
+    existing = col_index.find_one({"_id": hash_val})
+    if existing:
+        col_index.update_one({"_id": hash_val}, {"$inc": {"ref_count": 1}})
+        return hash_val, existing["local_path"], False
+
+    filename  = resolve_filename(safe_filename(original_name), att_dir, hash_val, col_index)
+    file_path = att_dir / filename
+    file_path.write_bytes(content)
+
+    col_index.insert_one({
+        "_id":           hash_val,
+        "filename":      filename,
+        "local_path":    filename,
+        "size_bytes":    len(content),
+        "mime_type":     mime_type,
+        "mailbox":       mailbox,
+        "first_seen_at": datetime.now(timezone.utc).replace(tzinfo=None),
+        "ref_count":     1,
+    })
+
+    return hash_val, filename, True
+
+
+# ─── MAIN ─────────────────────────────────────────────────────────────────────
+
+def main():
+    ap = argparse.ArgumentParser(description=f"download_attachments v{SCRIPT_VERSION}")
+    ap.add_argument("--mailbox",       required=True,
+                    help="Email mailbox (e.g. ordinace@buzalkova.cz)")
+    ap.add_argument("--limit",         type=int, default=0,
+                    help="Process at most N emails (0 = all)")
+    ap.add_argument("--force-recheck", action="store_true",
+                    help="Re-verify emails whose attachments already have a file_hash")
+    ap.add_argument("--no-indexes",    action="store_true",
+                    help="Do not create indexes on the attachments_index collection")
+    args = ap.parse_args()
+
+    mailbox   = args.mailbox
+    att_dir   = EMAILS_BASE_DIR / mailbox / "Attachments"
+    mongo_col = mailbox
+
+    start = datetime.now()
+    print(f"=== download_attachments v{SCRIPT_VERSION} ===")
+    print(f"Start:    {start.strftime('%Y-%m-%d %H:%M:%S')}")
+    print(f"Mailbox:  {mailbox}")
+    print(f"Target directory: {att_dir}")
+    print(f"MongoDB:  {MONGO_URI} -> {MONGO_DB}.{mongo_col}")
+
+    att_dir.mkdir(parents=True, exist_ok=True)
+    print("  Directory OK")
+
+    print("\nConnecting to Graph API...")
+    try:
+        get_token()
+        print("  Graph API OK")
+    except Exception as e:
+        print(f"  ERROR: {e}")
+        sys.exit(1)
+
+    client = MongoClient(MONGO_URI, serverSelectionTimeoutMS=5000)
+    try:
+        client.admin.command("ping")
+        print("  MongoDB OK")
+    except Exception as e:
+        print(f"  ERROR: MongoDB is not reachable -- {e}")
+        sys.exit(1)
+
+    col_emails = client[MONGO_DB][mongo_col]
+    col_index  = client[MONGO_DB][MONGO_COL_INDEX]
+
+    if not args.no_indexes:
+        col_index.create_index("filename")
+        col_index.create_index("mime_type")
+        col_index.create_index("mailbox")
+
+    if args.force_recheck:
+        query = {"has_attachments": True}
+    else:
+        query = {
+            "has_attachments": True,
+            "attachments": {
+                "$elemMatch": {
+                    "is_inline": False,
+                    "file_hash": {"$exists": False},
+                }
+            }
+        }
+
+    total = col_emails.count_documents(query)
+    print(f"\nEmails to process: {total}")
+    if total == 0:
+        print("Nothing to download.")
+        client.close()
+        return
+
+    cursor = col_emails.find(query, {"_id": 1, "graph_id": 1, "subject": 1, "attachments": 1})
+    if args.limit:
+        cursor = cursor.limit(args.limit)
+
+    ok_count   = 0
+    new_count  = 0
+    dup_count  = 0
+    skip_count = 0
+    err_count  = 0
+    email_i    = 0
+    batch      = []
+
+    def flush():
+        if not batch:
+            return
+        try:
+            col_emails.bulk_write(batch, ordered=False)
+        except Exception as e:
+            logging.error("bulk_write: %s", e)
+            print(f"  ERROR bulk_write: {e}")
+        batch.clear()
+
+    for email_doc in cursor:
+        email_i  += 1
+        email_id  = email_doc["_id"]
+        graph_id  = email_doc.get("graph_id", "")
+        subject   = (email_doc.get("subject") or "")[:60]
+        att_list  = email_doc.get("attachments") or []
+
+        real_atts = [a for a in att_list if not a.get("is_inline", False)]
+        if not real_atts:
+            continue
+
+        print(f"\n  {email_i:>5}/{total}  {subject}")
+
+        # Fetch the attachment listing from Graph only if some attachments lack graph_att_id
+        need_listing = any(
+            not a.get("is_inline", False)
+            and not (not args.force_recheck and a.get("file_hash"))
+            and not a.get("graph_att_id")
+            for a in att_list
+        )
+        graph_atts = fetch_message_attachments(mailbox, graph_id) if need_listing else []
+
+        updated_atts = list(att_list)
+        email_ok     = True
+
+        for i, att in enumerate(updated_atts):
+            if att.get("is_inline", False):
+                continue
+            if not args.force_recheck and att.get("file_hash"):
+                continue
+
+            att_name     = att.get("filename", "")
+            att_size     = att.get("size_bytes", 0)
+            graph_att_id = att.get("graph_att_id")
+
+            # Skip S/MIME signatures
+            if Path(att_name).suffix.lower() in SKIP_EXTENSIONS:
+                updated_atts[i] = {**att, "file_hash": "skip", "local_path": ""}
+                skip_count += 1
+                print(f"         SKIP  {att_name} (S/MIME)")
+                continue
+
+            # Direct access via graph_att_id (emails parsed by v1.2+)
+            if graph_att_id:
+                content = fetch_attachment_content(mailbox, graph_id, graph_att_id)
+                if content is None:
+                    err_count += 1
+                    email_ok = False
+                    print(f"         ERR   {att_name} (download failed)")
+                    continue
+                # The stored is_inline flag is trusted on this path; it is not re-checked against Graph
+                mime_type = att.get("mime_type", "")
+            else:
+                # Fallback: name matching for old emails (parsed before v1.2)
+                graph_att = find_graph_att(att_name, att_size, graph_atts)
+
+                if not graph_att:
+                    logging.error("attachment not found [email=%s att=%s]", email_id, att_name)
+                    print(f"         ERR   {att_name} (not found)")
+                    err_count += 1
+                    email_ok = False
+                    continue
+
+                # If Graph reports it as inline, skip it
+                if graph_att.get("isInline", False):
+                    updated_atts[i] = {**att, "is_inline": True, "file_hash": "skip", "local_path": ""}
+                    skip_count += 1
+                    print(f"         SKIP  {att_name} (inline image)")
+                    continue
+
+                content = fetch_attachment_content(mailbox, graph_id, graph_att["id"])
+                if content is None:
+                    err_count += 1
+                    email_ok = False
+                    print(f"         ERR   {att_name} (download failed)")
+                    continue
+
+                mime_type = att.get("mime_type") or graph_att.get("contentType", "")
+
+            hash_val, local_path, was_new = save_attachment(
+                content, att_name, mime_type, mailbox, att_dir, col_index
+            )
+
+            updated_atts[i] = {**att, "file_hash": hash_val, "local_path": local_path}
+
+            if was_new:
+                new_count += 1
+                print(f"         NEW   {local_path}  ({len(content):,} B)")
+            else:
+                dup_count += 1
+                print(f"         DUP   {att_name} -> {local_path}")
+
+        if email_ok:
+            ok_count += 1
+
+        batch.append(UpdateOne({"_id": email_id}, {"$set": {"attachments": updated_atts}}))
+
+        if len(batch) >= BATCH_SIZE:
+            flush()
+
+        if email_i % 100 == 0:
+            elapsed = (datetime.now() - start).total_seconds()
+            print(f"  {'─'*60}")
+            print(f"  Progress: emails={email_i}/{total}  new={new_count}  dup={dup_count}  skip={skip_count}  err={err_count}  elapsed={int(elapsed)}s")
+            print(f"  {'─'*60}")
+
+    flush()
+
+    elapsed_total = (datetime.now() - start).total_seconds()
+    files_total   = col_index.count_documents({})
+    size_total    = sum(d.get("size_bytes", 0) for d in col_index.find({}, {"size_bytes": 1}))
+
+    print(f"\n{'='*52}")
+    print(f"Result:  emails={ok_count}  |  new={new_count}  |  dup={dup_count}  |  skip={skip_count}  |  err={err_count}")
+    print(f"Files in index: {files_total}  ({size_total / 1024 / 1024:.1f} MB)")
+    print(f"Total time: {int(elapsed_total//3600)}h {int((elapsed_total%3600)//60)}m {int(elapsed_total%60)}s")
+    print(f"\nEnd: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
+    if err_count:
+        print(f"Errors logged to: {LOG_FILE}")
+
+    client.close()
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,560 @@
+"""
+parse_emails_graph_v1.0.py
+Name:    parse_emails_graph_v1.0.py
+Version: 1.0
+Date:    2026-06-02
+Author:  vladimir.buzalka
+
+Description:
+    Reads all emails from the ordinace@buzalkova.cz mailbox directly via the
+    Microsoft Graph API and imports them as documents into MongoDB.
+    Extracts every available property from each message:
+
+        - subject, sender, recipients (To/CC/BCC with types)
+        - received, sent, created, and modified times (UTC)
+        - HTML body (max 2 MB) + text preview
+        - attachments (metadata: name, size, MIME type, inline flag)
+        - internet headers (SPF, DKIM, Received, X-*, ...)
+        - MAPI equivalents: importance, flag, conversation thread,
+          categories, In-Reply-To, References, ...
+        - additionally: isRead, isDraft, folder_path, inferenceClassification
+
+    Walks ALL mailbox folders recursively (Inbox, Sent, Deleted,
+    archive folders, ...).
+
+    DB:          emaily
+    Collection:  ordinace@buzalkova.cz
+    _id:         Internet Message-ID (or "graphid:<id>" as a fallback)
+
+    Safe to interrupt and re-run:
+        - upsert by _id, so duplicates are overwritten automatically
+        - --skip-existing loads the finished _id values from MongoDB and skips them
+
+    WARNING: The script only READS from the mailbox and never writes to it!
+
+Usage:
+    python parse_emails_graph_v1.0.py                    # full import
+    python parse_emails_graph_v1.0.py --limit 50         # test on the first 50
+    python parse_emails_graph_v1.0.py --skip-existing    # resume after an interruption
+    python parse_emails_graph_v1.0.py --folder Inbox     # only folders whose path contains "Inbox"
+    python parse_emails_graph_v1.0.py --no-indexes       # skip index creation at the end
+
+Dependencies:
+    msal, requests, pymongo, python-dateutil
+    Python 3.10+
+
+MongoDB document structure:
+    _id                     Internet Message-ID (or graphid: fallback)
+    graph_id                Graph API message ID (for possible follow-up operations)
+    subject                 message subject
+    normalized_subject      subject without RE:/FW:/AW: prefixes
+    importance              0=low 1=normal 2=high
+    flag_status             0=not flagged 1=flagged 2=complete
+    is_read                 bool, current read state in the mailbox
+    is_draft                bool
+    has_attachments         bool
+    attachment_count        int
+    inference_classification focused / other (Outlook AI sorting)
+    categories              [str]
+    conversation_id         Graph conversationId
+    conversation_index      base64 conversationIndex
+    conversation_topic      thread topic (from the Thread-Topic internet header)
+    in_reply_to             Message-ID of the previous message
+    internet_references     [Message-ID], the full thread history
+    received_at             datetime UTC
+    sent_at                 datetime UTC
+    created_at              datetime UTC, when the record was created in M365
+    modified_at             datetime UTC, time of the last modification
+    folder_id               Graph parentFolderId
+    folder_path             full folder path (e.g. Inbox/Subfolder)
+    sender.email            sender's email address
+    sender.name             sender's display name
+    to                      To string (joined)
+    cc                      CC string
+    bcc                     BCC string
+    recipients              [{type, email, name}], to/cc/bcc with types
+    body_html               HTML body (max 2 MB)
+    body_preview            text preview (max 255 chars from Graph; up to 2000 for plain-text bodies)
+    attachments             [{filename, size_bytes, mime_type,
+                              content_id, is_inline}]
+    headers                 dict of internet headers (lowercase_with_underscores)
+    parsed_at               datetime UTC, time of parsing
+
+Indexes:
+    received_at, sent_at, sender.email, graph_id (unique),
+    conversation_id, folder_path, has_attachments, categories,
+    importance, flag_status, is_read,
+    text_search (subject + body_preview + to + cc)
+
+Version history:
+    1.0  2026-06-02  Initial version, with the Graph API as the source
+"""
+
+import sys
+import re
+import logging
+import argparse
+import base64
+from pathlib import Path
+from datetime import datetime, timezone
+from typing import Optional
+
+import msal
+import requests
+from dateutil import parser as dtparser
+from pymongo import MongoClient, UpdateOne, ASCENDING, TEXT
+
+if hasattr(sys.stdout, "reconfigure"):
+    sys.stdout.reconfigure(encoding="utf-8", errors="replace")
+
+# ─── CONFIGURATION ────────────────────────────────────────────────────────────
+GRAPH_TENANT_ID     = "7d269944-37a4-43a1-8140-c7517dc426e9"
+GRAPH_CLIENT_ID     = "4b222bfd-78c9-4239-a53f-43006b3ed07f"
+GRAPH_CLIENT_SECRET = "Txg8Q~MjhocuopxsJyJBhPmDfMxZ2r5WpTFj1dfk"
+GRAPH_MAILBOX       = "ordinace@buzalkova.cz"
+GRAPH_URL           = "https://graph.microsoft.com/v1.0"
+
+MONGO_URI      = "mongodb://192.168.1.76:27017"
+MONGO_DB       = "emaily"
+MONGO_COL      = "ordinace@buzalkova.cz"
+BATCH_SIZE     = 100
+PAGE_SIZE      = 50
+LOG_FILE       = Path(__file__).parent / "parse_emails_errors.log"
+SCRIPT_VERSION = "1.0"
+# ──────────────────────────────────────────────────────────────────────────────
+
+logging.basicConfig(
+    filename=str(LOG_FILE),
+    level=logging.ERROR,
+    format="%(asctime)s | %(message)s",
+    datefmt="%Y-%m-%d %H:%M:%S",
+    encoding="utf-8",
+)
+
+IMPORTANCE_MAP  = {"low": 0, "normal": 1, "high": 2}
+FLAG_STATUS_MAP = {"notFlagged": 0, "flagged": 1, "complete": 2}
+RE_SUBJECT      = re.compile(r"^(RE|FW|AW|SV|VS|TR|WG|odpov[eě]d[ťt]|fwd?)[:\s]+", re.IGNORECASE)
+
+MSG_SELECT = (
+    "id,internetMessageId,subject,bodyPreview,body,"
+    "importance,isRead,isDraft,hasAttachments,"
+    "receivedDateTime,sentDateTime,createdDateTime,lastModifiedDateTime,"
+    "sender,from,toRecipients,ccRecipients,bccRecipients,replyTo,"
+    "conversationId,conversationIndex,parentFolderId,"
+    "categories,flag,inferenceClassification,internetMessageHeaders"
+)
+
+
+# ─── Graph API helpers ────────────────────────────────────────────────────────
+
+_graph_token: Optional[str] = None
+
+
+def get_token() -> str:
+    global _graph_token
+    app = msal.ConfidentialClientApplication(
+        GRAPH_CLIENT_ID,
+        authority=f"https://login.microsoftonline.com/{GRAPH_TENANT_ID}",
+        client_credential=GRAPH_CLIENT_SECRET,
+    )
+    result = app.acquire_token_for_client(scopes=["https://graph.microsoft.com/.default"])
+    if "access_token" not in result:
+        raise RuntimeError(f"Graph auth failed: {result}")
+    _graph_token = result["access_token"]
+    return _graph_token
+
+
+def graph_get(url: str, params: Optional[dict] = None) -> dict:
+    global _graph_token
+    if not _graph_token:
+        get_token()
+    for attempt in range(2):
+        r = requests.get(url, headers={"Authorization": f"Bearer {_graph_token}"}, params=params, timeout=30)
+        if r.status_code == 401:
+            get_token()
+            continue
+        r.raise_for_status()
+        return r.json()
+    raise RuntimeError(f"Graph GET failed after retry: {url}")
+
+
+def get_all_folders(parent_id: Optional[str] = None, parent_path: str = "") -> list[dict]:
+    """Recursively loads all mailbox folders. Returns [{id, path}]."""
+    if parent_id is None:
+        url = f"{GRAPH_URL}/users/{GRAPH_MAILBOX}/mailFolders"
+    else:
+        url = f"{GRAPH_URL}/users/{GRAPH_MAILBOX}/mailFolders/{parent_id}/childFolders"
+
+    folders = []
+    params = {"$top": 100, "$select": "id,displayName,childFolderCount"}
+    while url:
+        data = graph_get(url, params)
+        for f in data.get("value", []):
+            path = f"{parent_path}/{f['displayName']}".lstrip("/")
+            folders.append({"id": f["id"], "path": path})
+            if f.get("childFolderCount", 0) > 0:
+                folders.extend(get_all_folders(f["id"], path))
+        url = data.get("@odata.nextLink")
+        params = None
+    return folders
+
+
+def iter_folder_messages(folder_id: str):
+    """Generator: yields the folder's messages, fetching them page by page."""
+    url = f"{GRAPH_URL}/users/{GRAPH_MAILBOX}/mailFolders/{folder_id}/messages"
+    params = {"$top": PAGE_SIZE, "$select": MSG_SELECT, "$expand": "attachments"}
+    while url:
+        data = graph_get(url, params)
+        for msg in data.get("value", []):
+            yield msg
+        url = data.get("@odata.nextLink")
+        params = None
+
+
+# ─── Helper functions ─────────────────────────────────────────────────────────
+
+def parse_date(raw) -> Optional[datetime]:
+    if raw is None:
+        return None
+    if isinstance(raw, datetime):
+        if raw.tzinfo:
+            return raw.astimezone(timezone.utc).replace(tzinfo=None)
+        return raw
+    try:
+        dt = dtparser.parse(str(raw))
+        if dt.tzinfo:
+            return dt.astimezone(timezone.utc).replace(tzinfo=None)
+        return dt
+    except Exception:
+        return None
+
+
+def normalize_subject(subject: str) -> str:
+    s = subject.strip()
+    while True:
+        m = RE_SUBJECT.match(s)
+        if not m:
+            break
+        s = s[m.end():].strip()
+    return s
+
+
+def parse_headers(raw_headers: list) -> dict:
+    result = {}
+    for h in raw_headers:
+        k = h["name"].lower().replace("-", "_")
+        v = h["value"]
+        if k in result:
+            existing = result[k]
+            if isinstance(existing, list):
+                existing.append(v)
+            else:
+                result[k] = [existing, v]
+        else:
+            result[k] = v
+    return result
+
+
+def format_recipients(lst: list) -> str:
+    return "; ".join(
+        f'{r["emailAddress"].get("name", "")} <{r["emailAddress"].get("address", "")}>'.strip()
+        for r in lst
+    )
+
+
+# ─── Main extraction ──────────────────────────────────────────────────────────
+
+def extract_message(msg: dict, folder_path: str) -> Optional[dict]:
+    try:
+        # _id
+        mid = (msg.get("internetMessageId") or "").strip()
+        if not mid:
+            mid = f"graphid:{msg['id']}"
+
+        subject = msg.get("subject") or ""
+        norm_subject = normalize_subject(subject)
+
+        # body
+        body_html = None
+        body_preview = msg.get("bodyPreview") or ""
+        body = msg.get("body") or {}  # tolerate a missing or null body
+        if body.get("contentType") == "html":
+            content = body.get("content") or ""
+            body_html = content[:2 * 1024 * 1024]  # cap at 2 Mi characters
+        elif body.get("contentType") == "text":
+            body_preview = (body.get("content") or "")[:2000]
+
+        # sender
+        sender_ea = (msg.get("from") or msg.get("sender") or {}).get("emailAddress", {})
+        sender_email = sender_ea.get("address", "")
+        sender_name  = sender_ea.get("name", "")
+
+        # recipients
+        to_list  = msg.get("toRecipients", [])
+        cc_list  = msg.get("ccRecipients", [])
+        bcc_list = msg.get("bccRecipients", [])
+
+        recipients = (
+            [{"type": "to",  "email": r["emailAddress"].get("address",""), "name": r["emailAddress"].get("name","")} for r in to_list] +
+            [{"type": "cc",  "email": r["emailAddress"].get("address",""), "name": r["emailAddress"].get("name","")} for r in cc_list] +
+            [{"type": "bcc", "email": r["emailAddress"].get("address",""), "name": r["emailAddress"].get("name","")} for r in bcc_list]
+        )
+
+        # flags
+        importance  = IMPORTANCE_MAP.get(msg.get("importance", "normal"), 1)
+        flag_status = FLAG_STATUS_MAP.get((msg.get("flag") or {}).get("flagStatus", "notFlagged"), 0)
+
+        # internet headers
+        raw_headers = msg.get("internetMessageHeaders") or []
+        headers = parse_headers(raw_headers)
+
+        in_reply_to = headers.get("in_reply_to", "")
+        if isinstance(in_reply_to, list):
+            in_reply_to = in_reply_to[0]
+
+        refs_raw = headers.get("references", "")
+        if isinstance(refs_raw, list):
+            refs_raw = " ".join(refs_raw)
+        internet_refs = [r.strip() for r in refs_raw.split() if r.strip()] if refs_raw else []
+
+        conv_topic = headers.get("thread_topic", "")
+        if isinstance(conv_topic, list):
+            conv_topic = conv_topic[0]
+
+        # conversation index
+        conv_index = ""
+        ci_raw = msg.get("conversationIndex")
+        if ci_raw:
+            try:
+                conv_index = base64.b64encode(base64.b64decode(ci_raw)).decode()
+            except Exception:
+                conv_index = ci_raw
+
+        # attachments (metadata only; content is not stored)
+        attachments = []
+        for att in msg.get("attachments") or []:
+            fname = att.get("name") or ""
+            if not fname:
+                continue
+            attachments.append({
+                "filename":   fname,
+                "size_bytes": att.get("size", 0),
+                "mime_type":  att.get("contentType", "application/octet-stream"),
+                "content_id": att.get("contentId"),
+                "is_inline":  att.get("isInline", False),
+            })
+
+        return {
+            "_id":     mid,
+            "graph_id": msg["id"],
+
+            "subject":            subject,
+            "normalized_subject": norm_subject,
+            "importance":         importance,
+            "flag_status":        flag_status,
+            "is_read":            msg.get("isRead", False),
+            "is_draft":           msg.get("isDraft", False),
+            "has_attachments":    msg.get("hasAttachments", False),
+            "attachment_count":   len(attachments),
+            "inference_classification": msg.get("inferenceClassification", ""),
+            "categories":         msg.get("categories") or [],
+
+            "conversation_id":    msg.get("conversationId", ""),
+            "conversation_index": conv_index,
+            "conversation_topic": conv_topic,
+            "in_reply_to":        in_reply_to,
+            "internet_references": internet_refs,
+
+            "received_at": parse_date(msg.get("receivedDateTime")),
+            "sent_at":     parse_date(msg.get("sentDateTime")),
+            "created_at":  parse_date(msg.get("createdDateTime")),
+            "modified_at": parse_date(msg.get("lastModifiedDateTime")),
+
+            "folder_id":   msg.get("parentFolderId", ""),
+            "folder_path": folder_path,
+
+            "sender": {
+                "email": sender_email,
+                "name":  sender_name,
+            },
+            "to":         format_recipients(to_list),
+            "cc":         format_recipients(cc_list),
+            "bcc":        format_recipients(bcc_list),
+            "recipients": recipients,
+
+            "body_html":    body_html,
+            "body_preview": body_preview,
+
+            "attachments": attachments,
+            "headers":     headers,
+
+            "parsed_at": datetime.now(timezone.utc).replace(tzinfo=None),
+        }
+
+    except Exception as e:
+        logging.error("extract_message failed [%s]: %s", msg.get("id", "?"), e)
+        return None
+
+
+# ─── MongoDB indexes ──────────────────────────────────────────────────────────
+
+def create_indexes(col):
+    print("  Vytvarim indexy...")
+    col.create_index([("received_at",    ASCENDING)])
+    col.create_index([("sent_at",        ASCENDING)])
+    col.create_index([("sender.email",   ASCENDING)])
+    col.create_index([("graph_id",       ASCENDING)], unique=True, sparse=True)
+    col.create_index([("conversation_id", ASCENDING)])
+    col.create_index([("folder_path",    ASCENDING)])
+    col.create_index([("has_attachments", ASCENDING)])
+    col.create_index([("categories",     ASCENDING)])
+    col.create_index([("importance",     ASCENDING)])
+    col.create_index([("flag_status",    ASCENDING)])
+    col.create_index([("is_read",        ASCENDING)])
+    col.create_index([
+        ("subject",       TEXT),
+        ("body_preview",  TEXT),
+        ("to",            TEXT),
+        ("cc",            TEXT),
+    ], name="text_search", default_language="none")
+    print("  Indexy hotovy.")
+
+
+# ─── MAIN ─────────────────────────────────────────────────────────────────────
+
+def main():
+    ap = argparse.ArgumentParser(description=f"parse_emails_graph v{SCRIPT_VERSION}")
+    ap.add_argument("--limit",         type=int, default=0,
+                    help="Zpracovat max N zprav (0 = vse)")
+    ap.add_argument("--skip-existing", action="store_true",
+                    help="Preskocit zpravy ktere jiz jsou v MongoDB")
+    ap.add_argument("--folder",        default="",
+                    help="Zpracovat jen slozku se zadanym nazvem (napr. Inbox)")
+    ap.add_argument("--no-indexes",    action="store_true",
+                    help="Nevytvorit indexy na konci")
+    args = ap.parse_args()
+
+    start = datetime.now()
+    print(f"=== parse_emails_graph v{SCRIPT_VERSION} ===")
+    print(f"Start:    {start.strftime('%Y-%m-%d %H:%M:%S')}")
+    print(f"Schránka: {GRAPH_MAILBOX}")
+    print(f"MongoDB:  {MONGO_URI} -> {MONGO_DB}.{MONGO_COL}")
+
+    # Graph token
+    print("\nPřipojuji se k Graph API...")
+    try:
+        get_token()
+        print("  Graph API OK")
+    except Exception as e:
+        print(f"  CHYBA: {e}")
+        sys.exit(1)
+
+    # MongoDB
+    client = MongoClient(MONGO_URI, serverSelectionTimeoutMS=5000)
+    try:
+        client.admin.command("ping")
+        print("  MongoDB OK")
+    except Exception as e:
+        print(f"  CHYBA: MongoDB neni dostupna -- {e}")
+        sys.exit(1)
+    col = client[MONGO_DB][MONGO_COL]
+
+    # Skip existing
+    existing: set = set()
+    if args.skip_existing:
+        print("  Nacitam existujici zaznamy z MongoDB...")
+        existing = set(col.distinct("_id"))
+        print(f"  {len(existing)} jiz importovano")
+
+    # Folders
+    print("\nNacitam seznam slozek...")
+    all_folders = get_all_folders()
+    if args.folder:
+        all_folders = [f for f in all_folders if args.folder.lower() in f["path"].lower()]
+    print(f"  Slozek ke zpracovani: {len(all_folders)}")
+    for f in all_folders:
+        print(f"    {f['path']}")
+
+    # Import
+    batch     = []
+    ok_count  = 0
+    err_count = 0
+    skip_count = 0
+    total_i   = 0
+
+    def flush():
+        if not batch:
+            return
+        try:
+            col.bulk_write(batch, ordered=False)
+        except Exception as e:
+            logging.error("bulk_write: %s", e)
+            print(f"  CHYBA bulk_write: {e}")
+        batch.clear()
+
+    print()
+    for folder in all_folders:
+        print(f"--- Složka: {folder['path']} ---")
+        folder_count = 0
+
+        for msg in iter_folder_messages(folder["id"]):
+            if args.limit and total_i >= args.limit:
+                break
+
+            mid = (msg.get("internetMessageId") or "").strip() or f"graphid:{msg['id']}"
+
+            if mid in existing:
+                skip_count += 1
+                total_i += 1
+                continue
+
+            doc = extract_message(msg, folder["path"])
+            total_i += 1
+            folder_count += 1
+
+            if doc is None:
+                err_count += 1
+            else:
+                batch.append(UpdateOne({"_id": doc["_id"]}, {"$set": doc}, upsert=True))
+                ok_count += 1
+
+            if len(batch) >= BATCH_SIZE:
+                flush()
+
+            status      = "ERR " if doc is None else "OK  "
+            subject_str = (doc.get("subject") or "")[:60] if doc else "?"
+            sender_str  = (doc.get("sender", {}).get("email") or "")[:40] if doc else "?"
+            print(f"  {total_i:>6}  {status}  {subject_str:<60}  {sender_str}")
+
+            if total_i % 500 == 0:
+                elapsed = (datetime.now() - start).total_seconds()
+                rate    = total_i / elapsed if elapsed > 0 else 0
+                print(f"  {'─'*80}")
+                print(f"  Průběh: ok={ok_count}  skip={skip_count}  err={err_count}  {rate:.1f} msg/s")
+                print(f"  {'─'*80}")
+
+        flush()
+        print(f"  → {folder_count} zprav ze slozky {folder['path']}")
+
+        if args.limit and total_i >= args.limit:
+            break
+
+    elapsed_total = (datetime.now() - start).total_seconds()
+    print(f"\n{'='*52}")
+    print(f"Vysledek:  ok={ok_count}  |  skip={skip_count}  |  err={err_count}")
+    print(f"Celkovy cas: {int(elapsed_total//3600)}h {int((elapsed_total%3600)//60)}m {int(elapsed_total%60)}s")
+    print(f"Dokumentu v kolekci: {col.count_documents({})}")
+
+    if not args.no_indexes:
+        print()
+        create_indexes(col)
+
+    print(f"\nKonec: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
+    if err_count:
+        print(f"Chyby logovany do: {LOG_FILE}")
+
+    client.close()
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,611 @@
+"""
+parse_emails_graph_v1.3.py
+Name:    parse_emails_graph_v1.3.py
+Version: 1.3
+Date:    2026-06-02
+Author:  vladimir.buzalka
+
+Description:
+    Reads all emails from any mailbox directly via the Microsoft Graph API
+    and imports them as documents into MongoDB.
+    Extracts every available property from each message:
+
+        - subject, sender, recipients (To/CC/BCC with types)
+        - received, sent, created, and modified times (UTC)
+        - HTML body (max 2 MB) + text preview
+        - attachments (metadata: name, size, MIME type, inline flag, graph_att_id)
+        - internet headers (SPF, DKIM, Received, X-*, ...)
+        - MAPI equivalents: importance, flag, conversation thread,
+          categories, In-Reply-To, References, ...
+        - additionally: isRead, isDraft, folder_path, inferenceClassification
+
+    Walks ALL mailbox folders recursively (Inbox, Sent, Deleted,
+    archive folders, ...).
+
+    DB:          emaily
+    Collection:  <mailbox> (e.g. ordinace@buzalkova.cz)
+    _id:         Internet Message-ID (or "graphid:<id>" as a fallback)
+
+    WARNING: The script only READS from the mailbox and never writes to it!
+
+Usage:
+    # First import (everything):
+    python parse_emails_graph_v1.3.py --mailbox ordinace@buzalkova.cz
+
+    # Test on the first 50:
+    python parse_emails_graph_v1.3.py --mailbox ordinace@buzalkova.cz --limit 50 --no-indexes
+
+    # A single folder only:
+    python parse_emails_graph_v1.3.py --mailbox ordinace@buzalkova.cz --folder Inbox
+
+    # Resume after an interruption (new messages only):
+    python parse_emails_graph_v1.3.py --mailbox ordinace@buzalkova.cz --mode new-only
+
+    # Regular sync (updates is_read, flag, folder; imports new messages):
+    python parse_emails_graph_v1.3.py --mailbox ordinace@buzalkova.cz --mode sync
+
+    # A different mailbox:
+    python parse_emails_graph_v1.3.py --mailbox vladimir.buzalka@buzalka.cz
+
+Modes (--mode):
+    full      Full upsert of all fields for every message (default)
+    new-only  Skips messages already in MongoDB, imports only new ones
+    sync      Existing messages: updates only is_read/flag_status/categories/
+              modified_at/folder_path. New messages are imported in full.
+              Ideal for scheduled runs.
+
+Dependencies:
+    msal, requests, pymongo, python-dateutil
+    Python 3.10+
+
+MongoDB document structure:
+    _id                     Internet Message-ID (or graphid: fallback)
+    graph_id                Graph API message ID
+    subject                 message subject
+    normalized_subject      subject without RE:/FW:/AW: prefixes
+    importance              0=low 1=normal 2=high
+    flag_status             0=not flagged 1=flagged 2=complete
+    is_read                 bool, current read state in the mailbox
+    is_draft                bool
+    has_attachments         bool
+    attachment_count        int
+    inference_classification focused / other
+    categories              [str]
+    conversation_id         Graph conversationId
+    conversation_index      base64 conversationIndex
+    conversation_topic      thread topic (from the Thread-Topic internet header)
+    in_reply_to             Message-ID of the previous message
+    internet_references     [Message-ID]
+    received_at             datetime UTC
+    sent_at                 datetime UTC
+    created_at              datetime UTC
+    modified_at             datetime UTC
+    folder_id               Graph parentFolderId
+    folder_path             full folder path (e.g. Inbox/Subfolder)
+    sender.email            sender's email address
+    sender.name             display name
+    to                      To string (joined)
+    cc                      CC string
+    bcc                     BCC string
+    recipients              [{type, email, name}]
+    body_html               HTML body (max 2 MB)
+    body_preview            text preview (max 255 chars from Graph; up to 2000 for plain-text bodies)
+    attachments             [{filename, size_bytes, mime_type, is_inline, graph_att_id}]
+    headers                 dict of internet headers
+    parsed_at               datetime UTC
+
+Indexes:
+    received_at, sent_at, sender.email, graph_id (unique),
+    conversation_id, folder_path, has_attachments, categories,
+    importance, flag_status, is_read,
+    text_search (subject + body_preview + to + cc)
+
+Version history:
+    1.0  2026-06-02  Initial version
+    1.1  2026-06-02  Added --mode full/new-only/sync;
+                     removed --skip-existing (replaced by --mode new-only)
+    1.2  2026-06-02  $expand attachments with $select (no contentBytes, so faster);
+                     attachments store graph_att_id for direct download without name matching
+    1.3  2026-06-02  --mailbox as a required parameter: universal use for
+                     any mailbox; MongoDB collection = mailbox name
+"""
+
+import sys
+import re
+import logging
+import argparse
+import base64
+from pathlib import Path
+from datetime import datetime, timezone
+from typing import Optional
+
+import msal
+import requests
+from dateutil import parser as dtparser
+from pymongo import MongoClient, UpdateOne, ASCENDING, TEXT
+
+if hasattr(sys.stdout, "reconfigure"):
+    sys.stdout.reconfigure(encoding="utf-8", errors="replace")
+
+# ─── CONFIGURATION ────────────────────────────────────────────────────────────
+GRAPH_TENANT_ID     = "7d269944-37a4-43a1-8140-c7517dc426e9"
+GRAPH_CLIENT_ID     = "4b222bfd-78c9-4239-a53f-43006b3ed07f"
+GRAPH_CLIENT_SECRET = "Txg8Q~MjhocuopxsJyJBhPmDfMxZ2r5WpTFj1dfk"
+GRAPH_URL           = "https://graph.microsoft.com/v1.0"
+
+MONGO_URI      = "mongodb://192.168.1.76:27017"
+MONGO_DB       = "emaily"
+BATCH_SIZE     = 100
+PAGE_SIZE      = 50
+LOG_FILE       = Path(__file__).parent / "parse_emails_errors.log"
+SCRIPT_VERSION = "1.3"
+
+# The mailbox is set at runtime from the --mailbox parameter
+GRAPH_MAILBOX: str = ""
+# ──────────────────────────────────────────────────────────────────────────────
+
+logging.basicConfig(
+    filename=str(LOG_FILE),
+    level=logging.ERROR,
+    format="%(asctime)s | %(message)s",
+    datefmt="%Y-%m-%d %H:%M:%S",
+    encoding="utf-8",
+)
+
+IMPORTANCE_MAP  = {"low": 0, "normal": 1, "high": 2}
+FLAG_STATUS_MAP = {"notFlagged": 0, "flagged": 1, "complete": 2}
+RE_SUBJECT      = re.compile(r"^(RE|FW|AW|SV|VS|TR|WG|odpov[eě]d[ťt]|fwd?)[:\s]+", re.IGNORECASE)
+
+# $expand attachments without contentBytes: only the metadata we need
+ATT_EXPAND = "attachments($select=id,name,contentType,size,isInline)"
+
+MSG_SELECT = (
+    "id,internetMessageId,subject,bodyPreview,body,"
+    "importance,isRead,isDraft,hasAttachments,"
+    "receivedDateTime,sentDateTime,createdDateTime,lastModifiedDateTime,"
+    "sender,from,toRecipients,ccRecipients,bccRecipients,replyTo,"
+    "conversationId,conversationIndex,parentFolderId,"
+    "categories,flag,inferenceClassification,internetMessageHeaders"
+)
+
+MSG_SELECT_SYNC = (
+    "id,internetMessageId,isRead,isDraft,flag,categories,"
+    "lastModifiedDateTime,parentFolderId,importance"
+)
+
+
+# ─── Graph API helpers ────────────────────────────────────────────────────────
+
+_graph_token: Optional[str] = None
+
+
+def get_token() -> str:
+    global _graph_token
+    app = msal.ConfidentialClientApplication(
+        GRAPH_CLIENT_ID,
+        authority=f"https://login.microsoftonline.com/{GRAPH_TENANT_ID}",
+        client_credential=GRAPH_CLIENT_SECRET,
+    )
+    result = app.acquire_token_for_client(scopes=["https://graph.microsoft.com/.default"])
+    if "access_token" not in result:
+        raise RuntimeError(f"Graph auth failed: {result}")
+    _graph_token = result["access_token"]
+    return _graph_token
+
+
+def graph_get(url: str, params: Optional[dict] = None) -> dict:
+    global _graph_token
+    if not _graph_token:
+        get_token()
+    for attempt in range(2):
+        r = requests.get(url, headers={"Authorization": f"Bearer {_graph_token}"}, params=params, timeout=30)
+        if r.status_code == 401:
+            get_token()
+            continue
+        r.raise_for_status()
+        return r.json()
+    raise RuntimeError(f"Graph GET failed after retry: {url}")
+
+
+def get_all_folders(parent_id: Optional[str] = None, parent_path: str = "") -> list[dict]:
+    """Recursively loads all mailbox folders. Returns [{id, path}]."""
+    if parent_id is None:
+        url = f"{GRAPH_URL}/users/{GRAPH_MAILBOX}/mailFolders"
+    else:
+        url = f"{GRAPH_URL}/users/{GRAPH_MAILBOX}/mailFolders/{parent_id}/childFolders"
+
+    folders = []
+    params = {"$top": 100, "$select": "id,displayName,childFolderCount"}
+    while url:
+        data = graph_get(url, params)
+        for f in data.get("value", []):
+            path = f"{parent_path}/{f['displayName']}".lstrip("/")
+            folders.append({"id": f["id"], "path": path})
+            if f.get("childFolderCount", 0) > 0:
+                folders.extend(get_all_folders(f["id"], path))
+        url = data.get("@odata.nextLink")
+        params = None
+    return folders
+
+
+def iter_folder_messages(folder_id: str, select: str = MSG_SELECT, expand_attachments: bool = True):
+    """Generator: yields the folder's messages, fetching them page by page."""
+    url = f"{GRAPH_URL}/users/{GRAPH_MAILBOX}/mailFolders/{folder_id}/messages"
+    params = {"$top": PAGE_SIZE, "$select": select}
+    if expand_attachments:
+        params["$expand"] = ATT_EXPAND
+    while url:
+        data = graph_get(url, params)
+        for msg in data.get("value", []):
+            yield msg
+        url = data.get("@odata.nextLink")
+        params = None
+
+
+# ─── Helper functions ─────────────────────────────────────────────────────────
+
+def parse_date(raw) -> Optional[datetime]:
+    if raw is None:
+        return None
+    if isinstance(raw, datetime):
+        if raw.tzinfo:
+            return raw.astimezone(timezone.utc).replace(tzinfo=None)
+        return raw
+    try:
+        dt = dtparser.parse(str(raw))
+        if dt.tzinfo:
+            return dt.astimezone(timezone.utc).replace(tzinfo=None)
+        return dt
+    except Exception:
+        return None
+
+
+def normalize_subject(subject: str) -> str:
+    s = subject.strip()
+    while True:
+        m = RE_SUBJECT.match(s)
+        if not m:
+            break
+        s = s[m.end():].strip()
+    return s
+
+
+def parse_headers(raw_headers: list) -> dict:
+    result = {}
+    for h in raw_headers:
+        k = h["name"].lower().replace("-", "_")
+        v = h["value"]
+        if k in result:
+            existing = result[k]
+            result[k] = existing + [v] if isinstance(existing, list) else [existing, v]
+        else:
+            result[k] = v
+    return result
+
+
+def format_recipients(lst: list) -> str:
+    return "; ".join(
+        f'{r["emailAddress"].get("name", "")} <{r["emailAddress"].get("address", "")}>'.strip()
+        for r in lst
+    )
+
+
+# ─── Message extraction ───────────────────────────────────────────────────────
+
+def extract_message(msg: dict, folder_path: str) -> Optional[dict]:
+    """Full extraction: used for mode full and for new messages in sync/new-only."""
+    try:
+        mid = (msg.get("internetMessageId") or "").strip() or f"graphid:{msg['id']}"
+        subject = msg.get("subject") or ""
+
+        body_html = None
+        body_preview = msg.get("bodyPreview") or ""
+        body = msg.get("body", {})
+        if body.get("contentType") == "html":
+            content = body.get("content") or ""
+            body_html = content[:2 * 1024 * 1024]  # cap at 2 Mi characters
+        elif body.get("contentType") == "text":
+            body_preview = (body.get("content") or "")[:2000]
+
+        sender_ea    = (msg.get("from") or msg.get("sender") or {}).get("emailAddress", {})
+        to_list      = msg.get("toRecipients") or []
+        cc_list      = msg.get("ccRecipients") or []
+        bcc_list     = msg.get("bccRecipients") or []
+
+        recipients = (
+            [{"type": "to",  "email": r["emailAddress"].get("address",""), "name": r["emailAddress"].get("name","")} for r in to_list] +
+            [{"type": "cc",  "email": r["emailAddress"].get("address",""), "name": r["emailAddress"].get("name","")} for r in cc_list] +
+            [{"type": "bcc", "email": r["emailAddress"].get("address",""), "name": r["emailAddress"].get("name","")} for r in bcc_list]
+        )
+
+        importance  = IMPORTANCE_MAP.get(msg.get("importance", "normal"), 1)
+        flag_status = FLAG_STATUS_MAP.get((msg.get("flag") or {}).get("flagStatus", "notFlagged"), 0)
+
+        raw_headers   = msg.get("internetMessageHeaders") or []
+        headers       = parse_headers(raw_headers)
+
+        in_reply_to = headers.get("in_reply_to", "")
+        if isinstance(in_reply_to, list):
+            in_reply_to = in_reply_to[0]
+
+        refs_raw = headers.get("references", "")
+        if isinstance(refs_raw, list):
+            refs_raw = " ".join(refs_raw)
+        internet_refs = [r.strip() for r in refs_raw.split() if r.strip()] if refs_raw else []
+
+        conv_topic = headers.get("thread_topic", "")
+        if isinstance(conv_topic, list):
+            conv_topic = conv_topic[0]
+
+        conv_index = ""
+        ci_raw = msg.get("conversationIndex")
+        if ci_raw:
+            try:
+                conv_index = base64.b64encode(base64.b64decode(ci_raw)).decode()
+            except Exception:
+                conv_index = ci_raw
+
+        attachments = []
+        for att in msg.get("attachments") or []:
+            fname = att.get("name") or ""
+            if not fname:
+                continue
+            attachments.append({
+                "filename":     fname,
+                "size_bytes":   att.get("size", 0),
+                "mime_type":    att.get("contentType", "application/octet-stream"),
+                "is_inline":    att.get("isInline", False),
+                "graph_att_id": att.get("id"),
+            })
+
+        return {
+            "_id":      mid,
+            "graph_id": msg["id"],
+
+            "subject":            subject,
+            "normalized_subject": normalize_subject(subject),
+            "importance":         importance,
+            "flag_status":        flag_status,
+            "is_read":            msg.get("isRead", False),
+            "is_draft":           msg.get("isDraft", False),
+            "has_attachments":    msg.get("hasAttachments", False),
+            "attachment_count":   len(attachments),
+            "inference_classification": msg.get("inferenceClassification", ""),
+            "categories":         msg.get("categories") or [],
+
+            "conversation_id":     msg.get("conversationId", ""),
+            "conversation_index":  conv_index,
+            "conversation_topic":  conv_topic,
+            "in_reply_to":         in_reply_to,
+            "internet_references": internet_refs,
+
+            "received_at": parse_date(msg.get("receivedDateTime")),
+            "sent_at":     parse_date(msg.get("sentDateTime")),
+            "created_at":  parse_date(msg.get("createdDateTime")),
+            "modified_at": parse_date(msg.get("lastModifiedDateTime")),
+
+            "folder_id":   msg.get("parentFolderId", ""),
+            "folder_path": folder_path,
+
+            "sender": {
+                "email": sender_ea.get("address", ""),
+                "name":  sender_ea.get("name", ""),
+            },
+            "to":         format_recipients(to_list),
+            "cc":         format_recipients(cc_list),
+            "bcc":        format_recipients(bcc_list),
+            "recipients": recipients,
+
+            "body_html":    body_html,
+            "body_preview": body_preview,
+
+            "attachments": attachments,
+            "headers":     headers,
+
+            "parsed_at": datetime.now(timezone.utc).replace(tzinfo=None),
+        }
+
+    except Exception as e:
+        logging.error("extract_message failed [%s]: %s", msg.get("id", "?"), e)
+        return None
+
+
+def extract_sync_fields(msg: dict, folder_path: str) -> dict:
+    """Mutable fields only; used in sync mode for messages that already exist."""
+    return {
+        "is_read":    msg.get("isRead", False),
+        "is_draft":   msg.get("isDraft", False),
+        "flag_status": FLAG_STATUS_MAP.get((msg.get("flag") or {}).get("flagStatus", "notFlagged"), 0),
+        "importance":  IMPORTANCE_MAP.get(msg.get("importance", "normal"), 1),
+        "categories":  msg.get("categories") or [],
+        "modified_at": parse_date(msg.get("lastModifiedDateTime")),
+        "folder_id":   msg.get("parentFolderId", ""),
+        "folder_path": folder_path,
+        "parsed_at":   datetime.now(timezone.utc).replace(tzinfo=None),
+    }
+
+
+# ─── MongoDB indexes ──────────────────────────────────────────────────────────
+
+def create_indexes(col):
+    print("  Vytvarim indexy...")
+    col.create_index([("received_at",     ASCENDING)])
+    col.create_index([("sent_at",         ASCENDING)])
+    col.create_index([("sender.email",    ASCENDING)])
+    col.create_index([("graph_id",        ASCENDING)], unique=True, sparse=True)
+    col.create_index([("conversation_id", ASCENDING)])
+    col.create_index([("folder_path",     ASCENDING)])
+    col.create_index([("has_attachments", ASCENDING)])
+    col.create_index([("categories",      ASCENDING)])
+    col.create_index([("importance",      ASCENDING)])
+    col.create_index([("flag_status",     ASCENDING)])
+    col.create_index([("is_read",         ASCENDING)])
+    col.create_index([
+        ("subject",      TEXT),
+        ("body_preview", TEXT),
+        ("to",           TEXT),
+        ("cc",           TEXT),
+    ], name="text_search", default_language="none")
+    print("  Indexy hotovy.")
+
+
+# ─── MAIN ─────────────────────────────────────────────────────────────────────
+
+def main():
+    global GRAPH_MAILBOX
+
+    ap = argparse.ArgumentParser(description=f"parse_emails_graph v{SCRIPT_VERSION}")
+    ap.add_argument("--mailbox",    required=True,
+                    help="Mailbox address (e.g. ordinace@buzalkova.cz)")
+    ap.add_argument("--mode", default="full", choices=["full", "new-only", "sync"],
+                    help="full=full upsert (default) | new-only=new messages only | "
+                         "sync=update only mutable fields of existing messages, import new ones in full")
+    ap.add_argument("--limit",      type=int, default=0,
+                    help="Process at most N messages (0 = all)")
+    ap.add_argument("--folder",     default="",
+                    help="Process only folders whose path contains this name (e.g. Inbox)")
+    ap.add_argument("--no-indexes", action="store_true",
+                    help="Do not create indexes at the end")
+    args = ap.parse_args()
+
+    GRAPH_MAILBOX = args.mailbox
+    mongo_col     = args.mailbox
+
+    start = datetime.now()
+    print(f"=== parse_emails_graph v{SCRIPT_VERSION} ===")
+    print(f"Start:    {start.strftime('%Y-%m-%d %H:%M:%S')}")
+    print(f"Schránka: {GRAPH_MAILBOX}")
+    print(f"MongoDB:  {MONGO_URI} -> {MONGO_DB}.{mongo_col}")
+    print(f"Režim:    {args.mode}")
+
+    print("\nPřipojuji se k Graph API...")
+    try:
+        get_token()
+        print("  Graph API OK")
+    except Exception as e:
+        print(f"  CHYBA: {e}")
+        sys.exit(1)
+
+    client = MongoClient(MONGO_URI, serverSelectionTimeoutMS=5000)
+    try:
+        client.admin.command("ping")
+        print("  MongoDB OK")
+    except Exception as e:
+        print(f"  CHYBA: MongoDB neni dostupna -- {e}")
+        sys.exit(1)
+    col = client[MONGO_DB][mongo_col]
+
+    existing: set = set()
+    if args.mode in ("new-only", "sync"):
+        print("  Nacitam existujici zaznamy z MongoDB...")
+        existing = set(col.distinct("_id"))
+        print(f"  {len(existing)} jiz importovano")
+
+    print("\nNacitam seznam slozek...")
+    all_folders = get_all_folders()
+    if args.folder:
+        all_folders = [f for f in all_folders if args.folder.lower() in f["path"].lower()]
+    print(f"  Slozek ke zpracovani: {len(all_folders)}")
+    for f in all_folders:
+        print(f"    {f['path']}")
+
+    is_sync    = args.mode == "sync"
+    msg_select = MSG_SELECT_SYNC if is_sync else MSG_SELECT
+    expand_att = not is_sync
+
+    batch      = []
+    ok_count   = 0
+    sync_count = 0
+    err_count  = 0
+    skip_count = 0
+    total_i    = 0
+
+    def flush():
+        if not batch:
+            return
+        try:
+            col.bulk_write(batch, ordered=False)
+        except Exception as e:
+            logging.error("bulk_write: %s", e)
+            print(f"  CHYBA bulk_write: {e}")
+        batch.clear()
+
+    print()
+    for folder in all_folders:
+        print(f"--- Složka: {folder['path']} ---")
+        folder_count = 0
+
+        for msg in iter_folder_messages(folder["id"], select=msg_select, expand_attachments=expand_att):
+            if args.limit and total_i >= args.limit:
+                break
+
+            mid = (msg.get("internetMessageId") or "").strip() or f"graphid:{msg['id']}"
+            total_i += 1
+            folder_count += 1
+
+            if args.mode == "new-only" and mid in existing:
+                skip_count += 1
+                continue
+
+            if is_sync and mid in existing:
+                fields = extract_sync_fields(msg, folder["path"])
+                batch.append(UpdateOne({"_id": mid}, {"$set": fields}))
+                sync_count += 1
+                print(f"  {total_i:>6}  SYN   {mid[:80]}")
+            else:
+                if is_sync:
+                    full_url = f"{GRAPH_URL}/users/{GRAPH_MAILBOX}/messages/{msg['id']}"
+                    full_params = {"$select": MSG_SELECT, "$expand": ATT_EXPAND}
+                    try:
+                        msg = graph_get(full_url, full_params)
+                    except Exception as e:
+                        logging.error("full fetch failed [%s]: %s", msg.get("id","?"), e)
+                        err_count += 1
+                        continue
+
+                doc = extract_message(msg, folder["path"])
+                if doc is None:
+                    err_count += 1
+                    print(f"  {total_i:>6}  ERR   {mid[:80]}")
+                else:
+                    batch.append(UpdateOne({"_id": doc["_id"]}, {"$set": doc}, upsert=True))
+                    ok_count += 1
+                    subject_str = (doc.get("subject") or "")[:60]
+                    sender_str  = (doc.get("sender", {}).get("email") or "")[:40]
+                    print(f"  {total_i:>6}  OK    {subject_str:<60}  {sender_str}")
+
+            if len(batch) >= BATCH_SIZE:
+                flush()
+
+            if total_i % 500 == 0:
+                elapsed = (datetime.now() - start).total_seconds()
+                rate    = total_i / elapsed if elapsed > 0 else 0
+                print(f"  {'─'*80}")
+                print(f"  Průběh: ok={ok_count}  sync={sync_count}  skip={skip_count}  err={err_count}  {rate:.1f} msg/s")
+                print(f"  {'─'*80}")
+
+        flush()
+        print(f"  → {folder_count} zprav ze slozky {folder['path']}")
+
+        if args.limit and total_i >= args.limit:
+            break
+
+    elapsed_total = (datetime.now() - start).total_seconds()
+    print(f"\n{'='*52}")
+    print(f"Vysledek:  ok={ok_count}  |  sync={sync_count}  |  skip={skip_count}  |  err={err_count}")
+    print(f"Celkovy cas: {int(elapsed_total//3600)}h {int((elapsed_total%3600)//60)}m {int(elapsed_total%60)}s")
+    print(f"Dokumentu v kolekci: {col.count_documents({})}")
+
+    if not args.no_indexes:
+        print()
+        create_indexes(col)
+
+    print(f"\nKonec: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
+    if err_count:
+        print(f"Chyby logovany do: {LOG_FILE}")
+
+    client.close()
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,248 @@
+# parse_emails_tower_v1.1
+
+## Launching
+
+**First run:**
+```bash
+docker exec -d python-runner bash -c \
+  "python /scripts/parse_emails_tower_v1.1.py > /scripts/parse_emails.log 2>&1"
+```
+
+**Resuming after an interruption (skips messages that are already imported):**
+```bash
+docker exec -d python-runner bash -c \
+  "python /scripts/parse_emails_tower_v1.1.py --skip-existing > /scripts/parse_emails.log 2>&1"
+```
+
+---
+
+## Import status
+
+**Following progress (live log):**
+```bash
+docker exec -it python-runner tail -f /scripts/parse_emails.log
+```
+
+**Number of emails in MongoDB:**
+```bash
+docker exec -it python-runner python -c \
+  "from pymongo import MongoClient; c=MongoClient('mongodb://192.168.1.76:27017'); print(c['emaily']['vbuzalka@its.jnj.com'].count_documents({}))"
+```
+
+---
+
+**Name:** parse_emails_tower_v1.1.py  
+**Version:** 1.1  
+**Date:** 2026-06-02  
+**Author:** vladimir.buzalka  
+
+---
+
+## Purpose
+
+Imports all `.msg` files into MongoDB. It extracts **every available property** from each file, much like EXIF data for photos.
+
+- **DB:** `emaily`  
+- **Collection:** `vbuzalka@its.jnj.com`  
+- `_id` = Internet Message-ID (or `filename:<stem>` as a fallback)  
+- Safe to interrupt and rerun: documents are upserted by `_id`
+
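The `_id` rule and the rerun safety described above can be illustrated with a minimal sketch. It uses a plain dictionary as a stand-in for the collection, and `document_id`, `upsert`, and the sample file name are hypothetical names chosen for illustration; the real script writes through MongoDB upserts rather than this simplified in-memory replacement.

```python
from pathlib import Path
from typing import Optional


def document_id(message_id: Optional[str], msg_path: Path) -> str:
    """Return the Message-ID if present, else a 'filename:<stem>' fallback."""
    mid = (message_id or "").strip()
    return mid if mid else f"filename:{msg_path.stem}"


def upsert(store: dict, doc: dict) -> None:
    """In-memory stand-in for an upsert keyed on _id (assumption: last write wins)."""
    store[doc["_id"]] = doc


store: dict = {}
path = Path("/mnt/JNJEMAILS/0123456789ABCDEF0000.msg")  # hypothetical file name

# Importing the same message twice (e.g. after a restart) leaves one document.
for _ in range(2):
    upsert(store, {"_id": document_id("<abc@example.com>", path), "subject": "RE: test"})

# A message without a Message-ID falls back to the file stem.
upsert(store, {"_id": document_id(None, path), "subject": "no Message-ID"})

print(sorted(store))
```

Because the key is derived deterministically from the message itself, a rerun produces the same keys and so overwrites documents rather than duplicating them.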
+---
+
+## Environment
+
+Runs in the **python-runner** Docker container on **Unraid Tower**.
+
+| Component | Location |
+|---|---|
+| Container | `python-runner` (Docker on Unraid Tower) |
+| .msg files | `/mnt/user/JNJEMAILS` → `/mnt/JNJEMAILS` inside the container |
+| Scripts | `/mnt/user/Scripts` → `/scripts` inside the container |
+| MongoDB | `192.168.1.76:27017` (external, outside the container) |
+
+---
+
+## Running (from the Unraid terminal)
+
+**Test on 50 emails:**
+```bash
+docker exec -it python-runner python /scripts/parse_emails_tower_v1.1.py --limit 50 --no-indexes
+```
+
+**Full import in the background (log to a file):**
+```bash
+docker exec -d python-runner bash -c \
+  "python /scripts/parse_emails_tower_v1.1.py > /scripts/parse_emails.log 2>&1"
+```
+
+**Resuming after an interruption:**
+```bash
+docker exec -d python-runner bash -c \
+  "python /scripts/parse_emails_tower_v1.1.py --skip-existing > /scripts/parse_emails.log 2>&1"
+```
+
+**Following progress (Ctrl+C stops following; the import keeps running):**
+```bash
+docker exec -it python-runner tail -f /scripts/parse_emails.log
+```
+
+### All parameters
+
+| Parameter | Description |
+|---|---|
+| `--skip-existing` | Loads the list of already processed files from MongoDB and skips them. Use it to resume after an interruption. |
+| `--limit N` | Processes only the first N files. Useful for testing. |
+| `--no-indexes` | Skips index creation at the end. Use it if you interrupt the run partway; create the indexes manually once everything is imported. |
+| `--msgs-dir PATH` | Overrides the default path to the .msg files (default: `/mnt/JNJEMAILS`). |
+
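The four parameters in the table could be declared with `argparse` roughly as follows. This is a sketch under assumptions, not the script's actual parser: the function name `build_parser`, the help texts, and the `0 = all` default for `--limit` are illustrative.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the parameter table above."""
    ap = argparse.ArgumentParser(description="parse_emails_tower (sketch)")
    ap.add_argument("--skip-existing", action="store_true",
                    help="skip files that are already in MongoDB")
    ap.add_argument("--limit", type=int, default=0,
                    help="process only the first N files (assumed: 0 = all)")
    ap.add_argument("--no-indexes", action="store_true",
                    help="do not create indexes at the end")
    ap.add_argument("--msgs-dir", type=Path, default=Path("/mnt/JNJEMAILS"),
                    help="directory containing the .msg files")
    return ap


# Equivalent to the "test on 50 emails" invocation.
args = build_parser().parse_args(["--limit", "50", "--no-indexes"])
print(args.limit, args.no_indexes, args.skip_existing, args.msgs_dir)
```

`argparse` converts the dashes in option names to underscores, so `--skip-existing` is read as `args.skip_existing` and `--msgs-dir` as `args.msgs_dir`.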
+---
+
+## Console output
+
+Each email is printed on one line:
+```
+       1/69371  OK    RE: Protocol deviation CZ10022                    jan.novak@its.jnj.com
+       2/69371  OK    UCO3001: Draft FUL pro DD5-CZ10022                monitor@4gclinical.com
+       3/69371  ERR   ?                                                  ?
+```
+
+Every 500 emails, a separator with a progress summary:
+```
+  ────────────────────────────────────────────────────────────────────────────────
+  Průběh: ok=498  err=2  0.4 msg/s  ETA 47h12m
+  ────────────────────────────────────────────────────────────────────────────────
+```
+
+Final summary:
+```
+====================================================
+Vysledek:  ok=69300  |  skip=0  |  err=71
+Celkovy cas: 47h 23m 10s
+Dokumentu v kolekci: 69300
+```
+
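The throughput and ETA figures in the progress line follow from simple arithmetic: throughput is processed messages divided by elapsed time, and ETA is the remaining count divided by throughput. The sketch below shows one way to compute them; `format_eta` and `progress_line` are hypothetical helpers, and the elapsed time is an invented input rather than a measured value.

```python
def format_eta(seconds: float) -> str:
    """Format a duration in seconds as '<hours>h<minutes>m'."""
    total_minutes = int(seconds // 60)
    return f"{total_minutes // 60}h{total_minutes % 60:02d}m"


def progress_line(ok: int, err: int, total: int, elapsed_s: float) -> str:
    """Build a progress line in the style of the sample output above."""
    done = ok + err
    rate = done / elapsed_s if elapsed_s > 0 else 0.0
    remaining = total - done
    eta = format_eta(remaining / rate) if rate > 0 else "?"
    return f"Průběh: ok={ok}  err={err}  {rate:.1f} msg/s  ETA {eta}"


# 500 messages in 1250 s gives 0.4 msg/s; 68 871 messages remain.
line = progress_line(ok=498, err=2, total=69371, elapsed_s=1250.0)
print(line)
```

The ETA is only as stable as the measured throughput, so early estimates can drift noticeably as message sizes vary.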
+---
+
+## Data extracted from each .msg
+
+| Field | Description |
+|---|---|
+| Subject, normalized subject | subject as stored, and the subject without RE:/FW: prefixes |
+| Sender | email, name, SMTP address |
+| To/CC/BCC recipients | structured as `[{type, email, name}]` |
+| Delivery and send time | UTC |
+| Body | plain text + HTML (max 2 MB) |
+| Attachments | metadata: name, size, MIME type, inline flag |
+| Internet headers | X-Originating-IP, Received, DKIM, X-Mailer, ... |
+| MAPI | importance, sensitivity, flag, conversation thread, categories |
+| In-Reply-To, References | for thread reconstruction |
+| Raw MAPI properties | `{0xXXXX: value}` |
+
+---
+
+## Value codes
+
+| Field | Value | Meaning |
+|---|---|---|
+| `importance` | 0 | Low |
+| | 1 | Normal |
+| | 2 | High |
+| `sensitivity` | 0 | Normal |
+| | 1 | Personal |
+| | 2 | Private |
+| | 3 | Confidential |
+| `flag_status` | 0 | Not flagged |
+| | 1 | Flagged (follow up) |
+| | 2 | Completed |
+
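When reading documents back, the numeric codes from the table can be turned into labels with small lookup dictionaries. This is a minimal sketch: the dictionary names and the `describe` helper are hypothetical, and the English labels are illustrative rather than values stored by the script.

```python
IMPORTANCE = {0: "low", 1: "normal", 2: "high"}
SENSITIVITY = {0: "normal", 1: "personal", 2: "private", 3: "confidential"}
FLAG_STATUS = {0: "not flagged", 1: "flagged (follow up)", 2: "completed"}


def describe(doc: dict) -> dict:
    """Translate the coded fields of a stored document into readable labels."""
    return {
        "importance": IMPORTANCE.get(doc.get("importance"), "unknown"),
        "sensitivity": SENSITIVITY.get(doc.get("sensitivity"), "unknown"),
        "flag_status": FLAG_STATUS.get(doc.get("flag_status"), "unknown"),
    }


labels = describe({"importance": 2, "sensitivity": 3, "flag_status": 1})
print(labels)
```

Unknown or missing codes map to `"unknown"` instead of raising, which keeps reporting code tolerant of documents written by other script versions.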
+---
+
+## MongoDB indexes
+
+Created automatically at the end of the import (`--no-indexes` skips this step):
+
+| Index | Fields |
+|---|---|
+| Chronological | `received_at`, `sent_at` |
+| Sender | `sender.email` |
+| File | `filename` (unique) |
+| Conversation | `conversation_topic` |
+| Filters | `has_attachments`, `categories`, `importance`, `flag_status` |
+| Full-text | `subject` + `body_text` + `to` + `cc` (text index `text_search`) |
+
+---
+
+## Sample queries (MongoDB shell / MCP)
+
+**Emails about UCO3001 with an attachment:**
+```javascript
+db["vbuzalka@its.jnj.com"].find({
+  $text: { $search: "UCO3001" },
+  has_attachments: true
+}).sort({ received_at: -1 })
+```
+
+**Emails from a specific sender:**
+```javascript
+db["vbuzalka@its.jnj.com"].find({
+  "sender.email": /covance/i
+}).sort({ received_at: -1 })
+```
+
+**An entire conversation thread:**
+```javascript
+db["vbuzalka@its.jnj.com"].find({
+  conversation_topic: "Protocol deviation CZ10022"
+}).sort({ received_at: 1 })
+```
+
+**Statistics by sender (top 20):**
+```javascript
+db["vbuzalka@its.jnj.com"].aggregate([
+  { $group: { _id: "$sender.email", count: { $sum: 1 } } },
+  { $sort: { count: -1 } },
+  { $limit: 20 }
+])
+```
+
+---
+
+## Error log
+
+Files that fail are logged to `parse_emails_errors.log` next to the script (i.e. `/scripts/parse_emails_errors.log` → `\\tower\Scripts\parse_emails_errors.log`):
+```
+2026-06-02 20:14:33 | open failed [7A3F...0000.msg]: <reason>
+```
+
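Since each failure is a single `timestamp | stage failed [file]: reason` line, the log can be summarized with a short script. The sketch below assumes that layout; `LINE_RE`, `parse_error_line`, and the sample lines (file names and reasons) are invented for illustration, and lines in any other shape are simply skipped.

```python
import re
from collections import Counter
from datetime import datetime
from typing import Optional

# Assumed layout: "<timestamp> | <stage> failed [<file>]: <reason>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \| "
    r"(?P<stage>\w+) failed \[(?P<file>[^\]]+)\]: (?P<reason>.*)$"
)


def parse_error_line(line: str) -> Optional[dict]:
    """Parse one log line; return None for lines in a different shape."""
    m = LINE_RE.match(line.rstrip("\n"))
    if not m:
        return None
    rec = m.groupdict()
    rec["ts"] = datetime.strptime(rec["ts"], "%Y-%m-%d %H:%M:%S")
    return rec


sample = [  # hypothetical log lines
    "2026-06-02 20:14:33 | open failed [EXAMPLE0001.msg]: invalid file",
    "2026-06-02 20:15:01 | extract failed [EXAMPLE0002.msg]: bad property",
    "2026-06-02 20:15:02 | extract_headers: something else",
]
records = [r for r in map(parse_error_line, sample) if r]
by_stage = Counter(r["stage"] for r in records)
print(by_stage)
```

The list of failed file names gathered this way can serve as input for a targeted retry once the cause is understood.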
+---
+
+## Performance
+
+| Parameter | Value |
+|---|---|
+| Number of files | ~69 000 |
+| Throughput | ~0.4 msg/s (htmlBody decoding) |
+| Estimated time | ~48 hours (69 000 files ÷ 0.4 msg/s ≈ 172 500 s) |
+| Batch size | 200 documents / bulk_write |
+| Estimated DB size | 2–5 GB |
+
+---
+
+## Dependencies (in the python-runner Docker image)
+
+```
+extract-msg==0.55.0
+pymongo
+python-dateutil
+```
+
+The image is built from the `Dockerfile` in `/mnt/user/Scripts/python-runner/`.
+
+---
+
+## Version history
+
+| Version | Date | Change |
+|---|---|---|
+| 1.0 | 2026-06-01 | Initial version |
+| 1.1 | 2026-06-02 | Deployed to Unraid Tower in the python-runner Docker container; MSGS_DIR changed from the SMB share (`\\tower\JNJEMAILS`) to a local mount (`/mnt/JNJEMAILS`); run instructions updated for `docker exec` |
@@ -0,0 +1,660 @@
+"""
+parse_emails_tower_v1.1.py
+Name:    parse_emails_tower_v1.1.py
+Version: 1.1
+Date:    2026-06-02
+Author:  vladimir.buzalka
+
+Description:
+    Parses all .msg files in MSGS_DIR and imports them as documents
+    into MongoDB. It extracts ALL available properties from each file,
+    much like EXIF data for photos:
+
+        - subject, sender, recipients (To/CC/BCC with types)
+        - delivery and send time (UTC)
+        - plain-text + HTML body (max 2 MB)
+        - attachments (metadata: name, size, MIME type, inline flag)
+        - internet headers (X-Originating-IP, Received, DKIM, ...)
+        - MAPI properties: importance, sensitivity, flag, conversation thread,
+          categories, In-Reply-To, References, ...
+        - all raw MAPI properties as {0xXXXX: value}
+
+    DB:          emaily
+    Collection:  vbuzalka@its.jnj.com
+    _id:         Internet Message-ID (or "filename:<stem>" as a fallback)
+
+    Safe to interrupt and rerun:
+        - upsert by _id: duplicates are overwritten automatically
+        - --skip-existing loads the list of finished files from MongoDB and
+          skips them => resume after an interruption without losing work
+
+Environment:
+    Runs in the "python-runner" Docker container on Unraid Tower.
+    The .msg files are available as a local disk (volume mount):
+        /mnt/user/JNJEMAILS  ->  /mnt/JNJEMAILS  (inside the container)
+    MongoDB at 192.168.1.76:27017 (external, runs outside the container).
+
+Running (from the Unraid terminal):
+    # Test on 50 emails:
+    docker exec -it python-runner python /scripts/parse_emails_tower_v1.1.py --limit 50 --no-indexes
+
+    # Full import in the background (log to a file):
+    docker exec -d python-runner bash -c \
+      "python /scripts/parse_emails_tower_v1.1.py > /scripts/parse_emails.log 2>&1"
+
+    # Resuming after an interruption:
+    docker exec -d python-runner bash -c \
+      "python /scripts/parse_emails_tower_v1.1.py --skip-existing > /scripts/parse_emails.log 2>&1"
+
+    # Following progress:
+    docker exec -it python-runner tail -f /scripts/parse_emails.log
+
+Console output:
+    Each email on one line:
+        <index>/<total>  OK/ERR  <subject, 60 chars>  <sender>
+    Every 500 emails: a separator with progress, throughput and ETA.
+    At the end: ok/skip/err summary, total time, number of documents in the collection.
+
+Dependencies (installed in the python-runner Docker image):
+    extract-msg==0.55.0, pymongo, python-dateutil
+    Python 3.12, Linux (Docker container on Unraid Tower)
+
+Document structure in MongoDB:
+    _id                     Internet Message-ID (or filename: fallback)
+    filename                name of the .msg file (20-character hex + .msg)
+    subject                 message subject
+    normalized_subject      subject without RE:/FW: prefixes
+    importance              0=low 1=normal 2=high
+    sensitivity             0=normal 1=personal 2=private 3=confidential
+    flag_status             0=not flagged 1=flagged 2=completed
+    read_receipt_requested  bool
+    delivery_receipt_requested bool
+    has_attachments         bool
+    attachment_count        int
+    message_size_bytes      size of the .msg file on disk
+    conversation_topic      thread topic (PR_CONVERSATION_TOPIC)
+    conversation_index      base64 PR_CONVERSATION_INDEX
+    in_reply_to             Message-ID of the previous message
+    internet_references     [Message-ID]: the full thread history
+    categories              [str]: MAPI categories / labels
+    received_at             datetime UTC: delivery time
+    sent_at                 datetime UTC: send time
+    sender.email            sender's email address
+    sender.name             sender's display name
+    sender.smtp             SMTP address (for internal EX addresses)
+    to                      To string (as shown in Outlook)
+    cc                      CC string
+    bcc                     BCC string
+    display_to              PR_DISPLAY_TO (abbreviated list)
+    display_cc              PR_DISPLAY_CC
+    recipients              [{type, email, name}]: to/cc/bcc with types
+    body_text               plain-text body
+    body_html               HTML body (max 2 MB, None if absent)
+    attachments             [{filename, size_bytes, mime_type,
+                              content_id, is_inline}]
+    headers                 dict of internet headers (lowercase_with_underscores)
+    mapi                    dict of all raw MAPI properties {0xXXXX: value}
+    parsed_at               datetime UTC: time of parsing
+
+Indexes (created automatically at the end):
+    received_at, sent_at, sender.email, filename (unique),
+    conversation_topic, has_attachments, categories, importance,
+    flag_status, text_search (subject + body_text + to + cc)
+
+Errors:
+    Files that fail are logged to parse_emails_errors.log
+    in the script's directory. Line: timestamp | open/extract failed | reason.
+
+Version history:
+    1.0  2026-06-01  Initial version
+    1.1  2026-06-02  Deployed to Unraid Tower in the python-runner Docker container;
+                     MSGS_DIR changed from the SMB share to the local mount /mnt/JNJEMAILS;
+                     run instructions updated for docker exec
+"""
+
+import sys
+import re
+import logging
+import argparse
+import base64
+from pathlib import Path
+from datetime import datetime, timezone
+from typing import Optional
+
+import extract_msg
+from dateutil import parser as dtparser
+from pymongo import MongoClient, UpdateOne, ASCENDING, TEXT
+
+if hasattr(sys.stdout, "reconfigure"):
+    sys.stdout.reconfigure(encoding="utf-8", errors="replace")
+
+# ─── CONFIGURATION ────────────────────────────────────────────────────────────
+MSGS_DIR       = Path("/mnt/JNJEMAILS")
+MONGO_URI      = "mongodb://192.168.1.76:27017"
+MONGO_DB       = "emaily"
+MONGO_COL      = "vbuzalka@its.jnj.com"
+BATCH_SIZE     = 200
+LOG_FILE       = Path(__file__).parent / "parse_emails_errors.log"
+SCRIPT_VERSION = "1.1"
+# ──────────────────────────────────────────────────────────────────────────────
+
+logging.basicConfig(
+    filename=str(LOG_FILE),
+    level=logging.ERROR,
+    format="%(asctime)s | %(message)s",
+    datefmt="%Y-%m-%d %H:%M:%S",
+    encoding="utf-8",
+)
+
+
+# ─── Helper functions ─────────────────────────────────────────────────────────
+
+def safe(obj, *attrs, default=None):
+    """Safe attribute read: return the first value that is neither None nor a blank string."""
+    for attr in attrs:
+        try:
+            val = getattr(obj, attr, None)
+            if val is None:
+                continue
+            if isinstance(val, str) and not val.strip():
+                continue
+            return val
+        except Exception:
+            continue
+    return default
+
+
+def parse_date(raw) -> Optional[datetime]:
+    """Any date value -> UTC datetime without tzinfo (for MongoDB); naive inputs are returned unchanged."""
+    if raw is None:
+        return None
+    if isinstance(raw, datetime):
+        if raw.tzinfo:
+            return raw.astimezone(timezone.utc).replace(tzinfo=None)
+        return raw
+    try:
+        dt = dtparser.parse(str(raw))
+        if dt.tzinfo:
+            return dt.astimezone(timezone.utc).replace(tzinfo=None)
+        return dt
+    except Exception:
+        return None
+
+
+def to_bson(val):
+    """Convert a value to a BSON-serializable type."""
+    if isinstance(val, bytes):
+        return val.hex() if len(val) <= 128 else f"<bytes:{len(val)}>"
+    if isinstance(val, datetime):
+        return parse_date(val)
+    if isinstance(val, (str, int, float, bool, type(None))):
+        return val
+    if isinstance(val, list):
+        return [to_bson(v) for v in val]
+    try:
+        return int(val)
+    except Exception:
+        pass
+    return str(val)
+
+
+# ─── Extraction of message parts ──────────────────────────────────────────────
+
+def extract_headers(msg) -> dict:
+    headers = {}
+    try:
+        hdr = msg.header
+        if not hdr:
+            return {}
+        from email.header import decode_header as _dh
+
+        def _decode(v: str) -> str:
+            try:
+                parts = _dh(v)
+                out = ""
+                for part, enc in parts:
+                    out += part.decode(enc or "utf-8", errors="replace") if isinstance(part, bytes) else part
+                return out
+            except Exception:
+                return v
+
+        for key in set(hdr.keys()):
+            k = key.lower().replace("-", "_")
+            vals = [_decode(v) for v in hdr.get_all(key, [])]
+            headers[k] = vals if len(vals) > 1 else (vals[0] if vals else "")
+    except Exception as e:
+        logging.error("extract_headers: %s", e)
+    return headers
+
+
+def extract_recipients(msg) -> list:
+    result = []
+    type_map = {1: "to", 2: "cc", 3: "bcc"}
+    try:
+        for r in msg.recipients:
+            rtype = getattr(r, "type", 1)
+            try:
+                rtype = int(rtype)
+            except Exception:
+                try:
+                    rtype = int(rtype.value)
+                except Exception:
+                    rtype = 1
+            rec = {
+                "type":  type_map.get(rtype, "to"),
+                "email": safe(r, "email", default=""),
+                "name":  safe(r, "name",  default=""),
+            }
+            result.append(rec)
+    except Exception as e:
+        logging.error("extract_recipients: %s", e)
+    return result
+
+
+def extract_attachments(msg) -> list:
+    result = []
+    try:
+        for att in msg.attachments:
+            fname = safe(att, "longFilename", "shortFilename", default="")
+            if not fname:
+                continue
+            size = 0
+            try:
+                d = att.data
+                size = len(d) if d else 0
+            except Exception:
+                pass
+            result.append({
+                "filename":   fname,
+                "size_bytes": size,
+                "mime_type":  safe(att, "mimetype", "mimeType", default="application/octet-stream"),
+                "content_id": safe(att, "cid", default=None),
+                "is_inline":  bool(safe(att, "isInline", default=False)),
+            })
+    except Exception as e:
+        logging.error("extract_attachments: %s", e)
+    return result
+
+
+def extract_mapi_props(msg) -> dict:
+    """All raw MAPI properties as {0xXXXX: value}."""
+    result = {}
+    try:
+        props = msg.props
+        if not hasattr(props, "items"):
+            return {}
+        for key, prop in props.items():
+            try:
+                val = to_bson(prop.value)
+                prop_id = f"0x{key[:4].upper()}" if len(key) >= 4 else f"0x{key.upper()}"
+                result[prop_id] = val
+            except Exception:
+                pass
+    except Exception as e:
+        logging.error("extract_mapi_props: %s", e)
+    return result
+
+
+# ─── Main extraction ─────────────────────────────────────────────────────────
+
+def extract_message(msg_path: Path) -> Optional[dict]:
+    """Parse one .msg file into a MongoDB document."""
+    try:
+        msg = extract_msg.Message(str(msg_path))
+    except Exception as e:
+        logging.error("open failed [%s]: %s", msg_path.name, e)
+        return None
+
+    try:
+        # ── Message-ID ────────────────────────────────────────────────
+        mid = None
+        for attr in ("messageId", "message_id", "internetMessageId"):
+            mid = safe(msg, attr)
+            if mid:
+                break
+        if not mid:
+            mid = f"filename:{msg_path.stem}"
+        mid = str(mid).strip()
+
+        # ── Subject ───────────────────────────────────────────────────
+        try:
+            subject = msg.subject or ""
+        except Exception:
+            subject = ""
+
+        normalized_subject = safe(msg, "normalizedSubject", "normalized_subject", default="")
+
+        # ── Body ──────────────────────────────────────────────────────
+        try:
+            body_text = msg.body or ""
+        except Exception:
+            body_text = ""
+
+        body_html = None
+        try:
+            bh = msg.htmlBody
+            if isinstance(bh, bytes):
+                bh = bh.decode("utf-8", errors="replace")
+            if bh:
+                body_html = bh if len(bh) <= 2 * 1024 * 1024 else bh[:2 * 1024 * 1024]
+        except Exception:
+            pass
+
+        # ── Sender ────────────────────────────────────────────────────
+        try:
+            sender_email = msg.sender or ""
+        except Exception:
+            sender_email = ""
+
+        sender_name = safe(msg, "senderName", "sender_name", default="")
+        sender_smtp = safe(msg, "senderSmtpAddress", "sent_representing_smtp_address", default="")
+
+        # ── Recipients ────────────────────────────────────────────────
+        recipients = extract_recipients(msg)
+
+        try:
+            to_raw = msg.to or ""
+        except Exception:
+            to_raw = ""
+        try:
+            cc_raw = msg.cc or ""
+        except Exception:
+            cc_raw = ""
+        try:
+            bcc_raw = getattr(msg, "bcc", None) or ""
+        except Exception:
+            bcc_raw = ""
+
+        display_to = safe(msg, "displayTo",  "display_to",  default="")
+        display_cc = safe(msg, "displayCc",  "display_cc",  default="")
+
+        # ── Timestamps ────────────────────────────────────────────────
+        try:
+            received_at = parse_date(msg.date)
+        except Exception:
+            received_at = None
+
+        sent_at = None
+        for attr in ("clientSubmitTime", "client_submit_time", "sentOn"):
+            v = safe(msg, attr)
+            if v:
+                sent_at = parse_date(v)
+                break
+
+        # ── MAPI properties ───────────────────────────────────────────
+        importance = 1
+        try:
+            v = msg.importance
+            if v is not None:
+                importance = int(v)
+        except Exception:
+            pass
+
+        sensitivity = 0
+        try:
+            v = getattr(msg, "sensitivity", None)
+            if v is not None:
+                sensitivity = int(v)
+        except Exception:
+            pass
+
+        flag_status = 0
+        try:
+            v = safe(msg, "flagStatus", "flag_status")
+            if v is not None:
+                flag_status = int(v)
+        except Exception:
+            pass
+
+        conversation_topic = safe(msg, "conversationTopic", "conversation_topic", default="")
+
+        conversation_index = ""
+        try:
+            ci = safe(msg, "conversationIndex", "conversation_index")
+            if isinstance(ci, bytes):
+                conversation_index = base64.b64encode(ci).decode()
+            elif ci:
+                conversation_index = str(ci)
+        except Exception:
+            pass
+
+        in_reply_to = safe(msg, "inReplyTo", "in_reply_to", default="")
+
+        internet_refs = []
+        try:
+            refs = safe(msg, "internetReferences", "internet_references")
+            if isinstance(refs, list):
+                internet_refs = refs
+            elif isinstance(refs, str) and refs:
+                internet_refs = [r.strip() for r in refs.split() if r.strip()]
+        except Exception:
+            pass
+
+        categories = []
+        try:
+            cats = safe(msg, "categories")
+            if isinstance(cats, list):
+                categories = [str(c) for c in cats if c]
+            elif isinstance(cats, str) and cats:
+                categories = [c.strip() for c in re.split(r"[;,]", cats) if c.strip()]
+        except Exception:
+            pass
+
+        read_receipt     = bool(safe(msg, "readReceiptRequested",    "read_receipt_requested",    default=False))
+        delivery_receipt = bool(safe(msg, "deliveryReceiptRequested", "delivery_receipt_requested", default=False))
+
+        # ── Internet headers ──────────────────────────────────────────
+        headers = extract_headers(msg)
+
+        if not in_reply_to:
+            in_reply_to = headers.get("in_reply_to", "")
+        if not internet_refs:
+            refs_str = headers.get("references", "")
+            if isinstance(refs_str, str) and refs_str:
+                internet_refs = [r.strip() for r in refs_str.split() if r.strip()]
+
+        # ── Attachments ───────────────────────────────────────────────
+        attachments = extract_attachments(msg)
+
+        # ── Raw MAPI ──────────────────────────────────────────────────
+        mapi_raw = extract_mapi_props(msg)
+
+        msg.close()
+
+        # ── Document ──────────────────────────────────────────────────
+        return {
+            "_id":      mid,
+            "filename": msg_path.name,
+
+            "subject":            subject,
+            "normalized_subject": normalized_subject,
+            "importance":         importance,
+            "sensitivity":        sensitivity,
+            "flag_status":        flag_status,
+            "read_receipt_requested":     read_receipt,
+            "delivery_receipt_requested": delivery_receipt,
+            "has_attachments":    len(attachments) > 0,
+            "attachment_count":   len(attachments),
+            "message_size_bytes": msg_path.stat().st_size,
+
+            "conversation_topic":  conversation_topic,
+            "conversation_index":  conversation_index,
+            "in_reply_to":         in_reply_to,
+            "internet_references": internet_refs,
+            "categories":          categories,
+
+            "received_at": received_at,
+            "sent_at":     sent_at,
+
+            "sender": {
+                "email": sender_email,
+                "name":  sender_name,
+                "smtp":  sender_smtp,
+            },
+            "to":         to_raw,
+            "cc":         cc_raw,
+            "bcc":        bcc_raw,
+            "display_to": display_to,
+            "display_cc": display_cc,
+            "recipients": recipients,
+
+            "body_text": body_text,
+            "body_html": body_html,
+
+            "attachments": attachments,
+            "headers":     headers,
+            "mapi":        mapi_raw,
+
+            "parsed_at": datetime.now(timezone.utc).replace(tzinfo=None),
+        }
+
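+    except Exception as e:
+        logging.error("extract_message failed [%s]: %s", msg_path.name, e)
+        # Release the file handle on the failure path too; the success path closes it earlier.
+        try:
+            msg.close()
+        except Exception:
+            pass
+        return None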
+
+
+# ─── MongoDB indexes ──────────────────────────────────────────────────────────
+
+def create_indexes(col):
+    print("  Creating indexes...")
+    col.create_index([("received_at",        ASCENDING)])
+    col.create_index([("sent_at",            ASCENDING)])
+    col.create_index([("sender.email",       ASCENDING)])
+    col.create_index([("filename",           ASCENDING)], unique=True, sparse=True)
+    col.create_index([("conversation_topic", ASCENDING)])
+    col.create_index([("has_attachments",    ASCENDING)])
+    col.create_index([("categories",         ASCENDING)])
+    col.create_index([("importance",         ASCENDING)])
+    col.create_index([("flag_status",        ASCENDING)])
+    col.create_index([
+        ("subject",   TEXT),
+        ("body_text", TEXT),
+        ("to",        TEXT),
+        ("cc",        TEXT),
+    ], name="text_search", default_language="none")
+    print("  Indexes ready.")
+
+
+# ─── MAIN ─────────────────────────────────────────────────────────────────────
+
+def main():
+    ap = argparse.ArgumentParser(description=f"parse_emails v{SCRIPT_VERSION}")
+    ap.add_argument("--msgs-dir",      default=str(MSGS_DIR),
+                    help="Path to the .msg files")
+    ap.add_argument("--limit",         type=int, default=0,
+                    help="Process at most N files (0 = all)")
+    ap.add_argument("--skip-existing", action="store_true",
+                    help="Skip files that are already in MongoDB (resume)")
+    ap.add_argument("--no-indexes",    action="store_true",
+                    help="Do not create indexes at the end")
+    args = ap.parse_args()
+
+    msgs_dir = Path(args.msgs_dir)
+    start    = datetime.now()
+
+    print(f"=== parse_emails v{SCRIPT_VERSION} ===")
+    print(f"Start:   {start.strftime('%Y-%m-%d %H:%M:%S')}")
+    print(f"Source:  {msgs_dir}")
+    print(f"MongoDB: {MONGO_URI} -> {MONGO_DB}.{MONGO_COL}")
+
+    # MongoDB
+    client = MongoClient(MONGO_URI, serverSelectionTimeoutMS=5000)
+    try:
+        client.admin.command("ping")
+        print("  MongoDB OK")
+    except Exception as e:
+        print(f"  ERROR: MongoDB is not reachable -- {e}")
+        sys.exit(1)
+
+    col = client[MONGO_DB][MONGO_COL]
+
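+    # Skip existing: load the filenames that are already imported
+    existing: set = set()
+    if args.skip_existing:
+        print("  Loading existing records from MongoDB...")
+        # find() streams through a cursor; distinct() returns a single reply capped at 16 MB
+        existing = {d["filename"] for d in col.find({}, {"filename": 1, "_id": 0}) if "filename" in d}
+        print(f"  {len(existing)} already imported")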
+
+    # Scan
+    print(f"\nScanning {msgs_dir} ...")
+    all_files = sorted(msgs_dir.glob("*.msg"))
+    if args.limit:
+        all_files = all_files[:args.limit]
+
+    to_process = [f for f in all_files if f.name not in existing]
+    skipped    = len(all_files) - len(to_process)
+    total      = len(to_process)
+
+    print(f"  Total .msg:     {len(all_files)}")
+    print(f"  Skipped:        {skipped}")
+    print(f"  To process:     {total}\n")
+
+    if total == 0:
+        print("Nothing to import.")
+        client.close()
+        return
+
+    batch     = []
+    ok_count  = 0
+    err_count = 0
+
+    def flush():
+        if not batch:
+            return
+        try:
+            col.bulk_write(batch, ordered=False)
+        except Exception as e:
+            logging.error("bulk_write: %s", e)
+            print(f"  ERROR bulk_write: {e}")
+        batch.clear()
+
+    for i, msg_path in enumerate(to_process, 1):
+        doc = extract_message(msg_path)
+
+        if doc is None:
+            err_count += 1
+        else:
+            batch.append(UpdateOne({"_id": doc["_id"]}, {"$set": doc}, upsert=True))
+            ok_count += 1
+
+        if len(batch) >= BATCH_SIZE:
+            flush()
+
+        # Print a status line for every email
+        status = "ERR " if doc is None else "OK  "
+        subject_str = (doc.get("subject") or "")[:60] if doc else "?"
+        sender_str  = (doc.get("sender", {}).get("email") or "")[:40] if doc else "?"
+        print(f"  {i:>6}/{total}  {status}  {subject_str:<60}  {sender_str}")
+
+        if i % 500 == 0:
+            elapsed = (datetime.now() - start).total_seconds()
+            rate    = i / elapsed if elapsed > 0 else 0
+            eta_s   = int((total - i) / rate) if rate > 0 else 0
+            print(f"  {'─'*80}")
+            print(f"  Progress: ok={ok_count}  err={err_count}  "
+                  f"{rate:.1f} msg/s  ETA {eta_s//3600}h{(eta_s%3600)//60}m")
+            print(f"  {'─'*80}")
+
+    flush()
+
+    elapsed_total = (datetime.now() - start).total_seconds()
+    print(f"\n{'='*52}")
+    print(f"Result:  ok={ok_count}  |  skip={skipped}  |  err={err_count}")
+    print(f"Total time: {int(elapsed_total//3600)}h {int((elapsed_total%3600)//60)}m {int(elapsed_total%60)}s")
+    print(f"Documents in collection: {col.count_documents({})}")
+
+    if not args.no_indexes:
+        print()
+        create_indexes(col)
+
+    print(f"\nEnd: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
+    if err_count:
+        print(f"Errors were logged to: {LOG_FILE}")
+
+    client.close()
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,122 @@
+# python-runner: Docker container on Tower
+
+## Basic info
+
+| Parameter      | Value                                        |
+|----------------|----------------------------------------------|
+| Name           | python-runner                                |
+| Image          | python-runner (custom)                       |
+| Status         | running (restart policy: unless-stopped)     |
+| Python         | 3.12.13                                      |
+| Start command  | `tail -f /dev/null` (the container only idles; scripts are started manually) |
+| Working dir    | `/scripts`                                   |
+| Created        | 2026-06-02                                   |
+
+---
+
+## Tower: SSH access
+
+| Parameter | Value           |
+|----------|------------------|
+| Host     | tower / 192.168.1.76 |
+| Port     | 22               |
+| User     | root             |
+| Heslo    | 7309208104       |
+
+**Connecting via Python (paramiko)**, since the Docker CLI is not available locally:
+
+```python
+import paramiko
+c = paramiko.SSHClient()
+c.set_missing_host_key_policy(paramiko.AutoAddPolicy())
+c.connect('192.168.1.76', username='root', password='7309208104')
+_, out, _ = c.exec_command('...')
+print(out.read().decode())
+c.close()
+```
+
+---
+
+## Volume mounts
+
+| Host (Unraid)         | Container         | Description                  |
+|-----------------------|-------------------|------------------------------|
+| `/mnt/user/Scripts`   | `/scripts`        | Scripts, logs (working dir)  |
+| `/mnt/user/JNJEMAILS` | `/mnt/JNJEMAILS`  | .msg email files (JNJ)       |
+
+---
+
+## Running scripts
+
+```bash
+# Interactive (output is shown in the terminal):
+docker exec -it python-runner python /scripts/parse_emails_tower_v1.1.py --limit 50 --no-indexes
+
+# In the background (output goes to a log file):
+docker exec -d python-runner bash -c \
+  "python /scripts/parse_emails_tower_v1.1.py > /scripts/parse_emails.log 2>&1"
+
+# Resume after an interruption (skips files already imported):
+docker exec -d python-runner bash -c \
+  "python /scripts/parse_emails_tower_v1.1.py --skip-existing > /scripts/parse_emails.log 2>&1"
+
+# Follow the progress:
+docker exec -it python-runner tail -f /scripts/parse_emails.log
+```
+
+---
+
+## Current scripts in /scripts
+
+| File                          | Description                                    |
+|-------------------------------|------------------------------------------------|
+| `parse_emails_tower_v1.1.py`  | Import .msg → MongoDB (db: emaily, collection: vbuzalka@its.jnj.com) |
+| `parse_emails_tower_v1.1.md`  | Script documentation                           |
+| `parse_emails.log`            | Import progress log                            |
+| `parse_emails_errors.log`     | Error log (files that failed)                  |
+
+Local counterpart: `EmailsImport/parse_emails_v1.0.py`. The code is identical; only the path differs
+(`\\tower\JNJEMAILS` over SMB vs. the local mount `/mnt/JNJEMAILS`), along with the version in the header.
+
+---
+
+## Installed Python packages
+
+```
+extract-msg        0.55.0
+pymongo            4.17.0
+python-dateutil    2.9.0.post0
+cryptography       48.0.0
+beautifulsoup4     4.13.5
+oletools           0.60.2
+msoffcrypto-tool   6.0.0
+olefile            0.47
+RTFDE              0.1.2.2
+compressed-rtf     1.0.7
+lark               1.3.1
+pcodedmp           1.2.6
+tzlocal            5.3.1
+six                1.17.0
+pip                25.0.1
+```
+
+---
+
+## Adding a new package
+
+```bash
+docker exec python-runner pip install <package>
+```
+
+> Warning: the installation is lost when the container is recreated, so the package must also be added to the Dockerfile or the setup script.
+
+---
+
+## parse_emails logic (both scripts)
+
+- Reads all `.msg` files from MSGS_DIR
+- Extracts: subject, sender, recipients (To/CC/BCC), body (text + HTML), attachments, internet headers, and all raw MAPI properties
+- Stores into MongoDB: database `emaily` → collection `vbuzalka@its.jnj.com`
+- `_id` = Internet Message-ID (or `filename:<stem>` as a fallback)
+- Documents are upserted, so re-runs are safe; `--skip-existing` resumes an interrupted import
+- Indexes: received_at, sent_at, sender.email, filename (unique), conversation_topic, has_attachments, categories, importance, flag_status, full-text (subject+body_text+to+cc)
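
The `_id` fallback and the upsert behaviour listed above can be sketched without a database. This is a minimal illustration under stated assumptions, not the script's own code: a plain `dict` stands in for the MongoDB collection, and `document_id` and `upsert` are hypothetical helper names introduced only for this example.

```python
from pathlib import Path
from typing import Optional


def document_id(message_id: Optional[str], msg_path: Path) -> str:
    """Use the Message-ID when present, otherwise fall back to filename:<stem>."""
    mid = message_id or f"filename:{msg_path.stem}"
    return str(mid).strip()


def upsert(store: dict, doc: dict) -> None:
    """Dict-based stand-in for an update keyed on _id with $set and upsert=True."""
    store.setdefault(doc["_id"], {}).update(doc)


store: dict = {}
no_id_path = Path("0001.msg")

# The second pass simulates re-running the import over the same file.
for _ in range(2):
    upsert(store, {"_id": document_id(None, no_id_path), "filename": no_id_path.name})

with_id_path = Path("0002.msg")
upsert(store, {"_id": document_id(" <abc@example.com> ", with_id_path),
               "filename": with_id_path.name})

print(len(store), sorted(store))
```

Importing the same file twice leaves a single entry per `_id`, which is the property that makes repeated runs safe.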