Run Submission Guidelines (DIALOG Task)

Input Files:

  • Each participant must translate eight input files: the automatic speech recognition (ASR) results and the correct recognition results (CRR) of the English and Chinese utterances of the IWSLT 2009 and IWSLT 2010 testset dialogs. The ASR output files consist of word lattices (SLF), lists of multiple recognition hypotheses (NBEST), or single-best recognition hypotheses (1BEST). Participants are free to choose any of the three ASR output data types as the input to their MT system. The CRR files are human transcriptions of the Dialog Task data files and do not contain recognition errors.

    • Chinese:

      • [ASR]

        (1) DIALOG/Chinese-English/test/1BEST/IWSLT09_DIALOG.testset.zh.1BEST.txt
          or DIALOG/Chinese-English/test/NBEST/IWSLT09_DIALOG.testset.zh.20BEST.txt
          or DIALOG/Chinese-English/test/SLF/IWSLT09_DIALOG.testset/*.SLF
        (2) DIALOG/Chinese-English/test/1BEST/IWSLT10_DIALOG.testset.zh.1BEST.txt
          or DIALOG/Chinese-English/test/NBEST/IWSLT10_DIALOG.testset.zh.20BEST.txt
          or DIALOG/Chinese-English/test/SLF/IWSLT10_DIALOG.testset/*.SLF
      • [CRR]

        (3) DIALOG/Chinese-English/test/TXT/IWSLT09_DIALOG.testset.zh.txt
        (4) DIALOG/Chinese-English/test/TXT/IWSLT10_DIALOG.testset.zh.txt
    • English:

      • [ASR]

        (5) DIALOG/English-Chinese/test/1BEST/IWSLT09_DIALOG.testset.en.1BEST.txt
          or DIALOG/English-Chinese/test/NBEST/IWSLT09_DIALOG.testset.en.20BEST.txt
          or DIALOG/English-Chinese/test/SLF/IWSLT09_DIALOG.testset/*.SLF
        (6) DIALOG/English-Chinese/test/1BEST/IWSLT10_DIALOG.testset.en.1BEST.txt
          or DIALOG/English-Chinese/test/NBEST/IWSLT10_DIALOG.testset.en.20BEST.txt
          or DIALOG/English-Chinese/test/SLF/IWSLT10_DIALOG.testset/*.SLF
      • [CRR]

        (7) DIALOG/English-Chinese/test/TXT/IWSLT09_DIALOG.testset.en.txt
        (8) DIALOG/English-Chinese/test/TXT/IWSLT10_DIALOG.testset.en.txt

Data Format:

  • The same format as the DIALOG Develop Corpus. For details, refer to the respective README files:

    • DIALOG/Chinese-English/README.DIALOG_CE.txt
    • DIALOG/English-Chinese/README.DIALOG_EC.txt
  • The input data sets are created from the speech recognition results (ASR output) and therefore are CASE-INSENSITIVE and do NOT contain punctuation.
  • The dialog structure is reflected in the respective sentence ID.

    • Example:

      • (dialog structure)
        IWSLT10_DIALOG.testset_dialog01_01\01\...1st English utterance...
        IWSLT10_DIALOG.testset_dialog01_02\01\...1st Chinese utterance...
        IWSLT10_DIALOG.testset_dialog01_03\01\...2nd English utterance...
        IWSLT10_DIALOG.testset_dialog01_04\01\...2nd Chinese utterance...
        IWSLT10_DIALOG.testset_dialog01_05\01\...3rd Chinese utterance...
        IWSLT10_DIALOG.testset_dialog01_06\01\...3rd English utterance...
        ...
      • (English input data to be translated into Chinese)
        IWSLT10_DIALOG.testset_dialog01_01\01\...1st English utterance...
        IWSLT10_DIALOG.testset_dialog01_03\01\...2nd English utterance...
        IWSLT10_DIALOG.testset_dialog01_06\01\...3rd English utterance...
        ...
      • (Chinese input data to be translated into English)
        IWSLT10_DIALOG.testset_dialog01_02\01\...1st Chinese utterance...
        IWSLT10_DIALOG.testset_dialog01_04\01\...2nd Chinese utterance...
        IWSLT10_DIALOG.testset_dialog01_05\01\...3rd Chinese utterance...
        ...
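For illustration, the dialog and utterance numbers can be recovered from a sentence ID with a few lines of script. This is only a sketch, not part of the official tooling, and it assumes the IDs follow exactly the pattern shown in the examples above:

```python
import re

# Assumed ID pattern (see the examples above), e.g.
# "IWSLT10_DIALOG.testset_dialog01_03" -> dialog 1, utterance 3.
ID_PATTERN = re.compile(r"_dialog(\d+)_(\d+)$")

def parse_sentence_id(sentence_id):
    """Return (dialog_number, utterance_number) parsed from a sentence ID."""
    match = ID_PATTERN.search(sentence_id)
    if match is None:
        raise ValueError("unexpected sentence ID: %r" % sentence_id)
    return int(match.group(1)), int(match.group(2))

print(parse_sentence_id("IWSLT10_DIALOG.testset_dialog01_03"))  # (1, 3)
```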
  • Chinese MT output should:

    • be in the same format (<SentenceID>\01\MT_output_text) as the English CRR input file
    • be case-sensitive, with punctuation
    • contain the same number of lines (= sentences) as the English CRR input file

      • Example:

        IWSLT10_DIALOG.testset_dialog01_01\01\...Chinese translation of 1st English utterance...
        IWSLT10_DIALOG.testset_dialog01_03\01\...Chinese translation of 2nd English utterance...
        IWSLT10_DIALOG.testset_dialog01_06\01\...Chinese translation of 3rd English utterance...
        ...
        
  • English MT output should:

    • be in the same format (<SentenceID>\01\MT_output_text) as the Chinese CRR input file
    • be case-sensitive, with punctuation
    • contain the same number of lines (= sentences) as the Chinese CRR input file

      • Example:

        IWSLT10_DIALOG.testset_dialog01_02\01\...English translation of 1st Chinese utterance...
        IWSLT10_DIALOG.testset_dialog01_04\01\...English translation of 2nd Chinese utterance...
        IWSLT10_DIALOG.testset_dialog01_05\01\...English translation of 3rd Chinese utterance...
        ...
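The line-count and sentence-ID requirements above can be checked automatically before submission. The following is a hypothetical checker (not an official tool); it assumes the fields of each line are separated by single backslashes, as in <SentenceID>\01\MT_output_text:

```python
def read_ids(lines):
    """Extract the sentence ID (the text before the first backslash) from
    each non-blank line, assuming the <SentenceID>\\01\\text layout above."""
    return [line.rstrip("\n").split("\\", 1)[0] for line in lines if line.strip()]

def check_output(crr_lines, mt_lines):
    """Verify the MT output has the same sentence IDs, in order, as the CRR input."""
    crr_ids, mt_ids = read_ids(crr_lines), read_ids(mt_lines)
    if len(crr_ids) != len(mt_ids):
        raise ValueError("line counts differ: %d vs %d" % (len(crr_ids), len(mt_ids)))
    for pos, (crr_id, mt_id) in enumerate(zip(crr_ids, mt_ids), start=1):
        if crr_id != mt_id:
            raise ValueError("ID mismatch on line %d: %s vs %s" % (pos, crr_id, mt_id))
    return True
```

For example, `check_output(open(crr_path).readlines(), open(mt_path).readlines())` raises an error describing the first discrepancy, or returns True if the files line up.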
        

Run Submission Format:

  • Each participant registered for the DIALOG Task must translate both input conditions (ASR, CRR) of both testsets (IWSLT09, IWSLT10) in each translation direction (English→Chinese, Chinese→English), i.e., submit a total of 8 MT output files per run. Run submissions containing only partial results will be ignored for the IWSLT 2010 subjective evaluation.
  • Multiple run submissions are allowed, but participants must explicitly mark one PRIMARY run that will be used for the human assessments. All other run submissions are treated as CONTRASTIVE runs. If no run is marked as PRIMARY, the latest submission (according to the file time-stamp) will be used as the primary run submission.
  • Runs must be submitted as a gzipped TAR archive (format: see below) and sent as an email attachment to Michael Paul (michael DOT paul AT nict DOT go DOT jp).

     

    TAR archive file structure:

    <UserID>/<TestSet>_<TranslationTask>.<UserID>.primary.CRR.txt
    <UserID>/<TestSet>_<TranslationTask>.<UserID>.primary.ASR.<CONDITION>.txt
            /<TestSet>_<TranslationTask>.<UserID>.contrastive1.CRR.txt
            /<TestSet>_<TranslationTask>.<UserID>.contrastive1.ASR.<CONDITION>.txt
            /<TestSet>_<TranslationTask>.<UserID>.contrastive2.CRR.txt
            /<TestSet>_<TranslationTask>.<UserID>.contrastive2.ASR.<CONDITION>.txt
            /...

    where:

    <UserID> = user ID of participant used to download data files
    <TestSet> = IWSLT09 | IWSLT10
    <TranslationTask> = DIALOG.CE | DIALOG.EC
    <CONDITION> = SLF | <NUM>
    <NUM> = number of recognition hypotheses used for translation

    Examples:

        nict/IWSLT09_DIALOG.CE.nict.primary.CRR.txt
            /IWSLT09_DIALOG.CE.nict.primary.ASR.SLF.txt
            /IWSLT09_DIALOG.EC.nict.primary.CRR.txt
            /IWSLT09_DIALOG.EC.nict.primary.ASR.20.txt
            /IWSLT10_DIALOG.CE.nict.primary.CRR.txt
            /IWSLT10_DIALOG.CE.nict.primary.ASR.SLF.txt
            /IWSLT10_DIALOG.EC.nict.primary.CRR.txt
            /IWSLT10_DIALOG.EC.nict.primary.ASR.1.txt
    
            /IWSLT09_DIALOG.CE.nict.contrastive1.CRR.txt
            /IWSLT09_DIALOG.CE.nict.contrastive1.ASR.1.txt
            /IWSLT09_DIALOG.EC.nict.contrastive1.CRR.txt
            /IWSLT09_DIALOG.EC.nict.contrastive1.ASR.1.txt
            /IWSLT10_DIALOG.CE.nict.contrastive1.CRR.txt
            /IWSLT10_DIALOG.CE.nict.contrastive1.ASR.1.txt
            /IWSLT10_DIALOG.EC.nict.contrastive1.CRR.txt
            /IWSLT10_DIALOG.EC.nict.contrastive1.ASR.1.txt
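As a convenience, the archive can be assembled with a short script. The following sketch (an assumed workflow, not an official tool) places each run file under the <UserID> directory as required by the structure above:

```python
import os
import tarfile

def make_submission(user_id, run_files, archive_name):
    """Create a gzipped TAR archive with all run files under <user_id>/."""
    with tarfile.open(archive_name, "w:gz") as tar:
        for path in run_files:
            # Store each file as <UserID>/<file name>, per the structure above.
            tar.add(path, arcname="%s/%s" % (user_id, os.path.basename(path)))
```

For example, `make_submission("nict", run_files, "nict_runs.tar.gz")`, where run_files lists MT output files named according to the scheme above.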
  • Re-submitting your runs is allowed as long as the emails arrive BEFORE the submission deadline. If the same participant submits multiple TAR archives, only the runs from the most recent submission email will be used for the IWSLT 2010 evaluation; those from earlier emails will be ignored.