5 changes: 2 additions & 3 deletions .gitignore
@@ -1,3 +1,2 @@
.DS_Store
test
materials
vendor/
composer.lock
80 changes: 0 additions & 80 deletions CHANGELOG.md

This file was deleted.

21 changes: 21 additions & 0 deletions LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2017 Martins Pilsetnieks, Evgen Kinonin

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
28 changes: 0 additions & 28 deletions LICENSE.md

This file was deleted.

166 changes: 82 additions & 84 deletions README.md
@@ -1,85 +1,83 @@
**spreadsheet-reader** is a PHP spreadsheet reader that differs from others in that its main goal is efficient
data extraction that can handle large (as in really large) files. So far it is not necessarily CPU-, time-
or I/O-efficient, but at least it won't run out of memory (except maybe for XLS files).

So far XLSX, ODS and text/CSV file parsing should be memory-efficient. XLS file parsing is done with php-excel-reader
from http://code.google.com/p/php-excel-reader/ which, sadly, has memory issues with bigger spreadsheets, as it reads the
data all at once and keeps it all in memory.

### Requirements:
* PHP 5.3.0 or newer
* PHP must have Zip file support (see http://php.net/manual/en/zip.installation.php)

### Usage:

All data is read from the file sequentially, with each row being returned as a numeric array.
This is about the easiest way to read a file:

    <?php
        // If you need to parse XLS files, include php-excel-reader
        require('php-excel-reader/excel_reader2.php');

        require('SpreadsheetReader.php');

        $Reader = new SpreadsheetReader('example.xlsx');
        foreach ($Reader as $Row)
        {
            print_r($Row);
        }
    ?>

Reading multiple sheets is also supported for file formats that allow it. (A CSV file is handled as if
it has only one sheet.)

You can retrieve information about sheets contained in the file by calling the `Sheets()` method which returns an array with
sheet indexes as keys and sheet names as values. Then you can change the sheet that's currently being read by passing that index
to the `ChangeSheet($Index)` method.

Example:

    <?php
        $Reader = new SpreadsheetReader('example.xlsx');
        $Sheets = $Reader -> Sheets();

        foreach ($Sheets as $Index => $Name)
        {
            echo 'Sheet #'.$Index.': '.$Name;

            $Reader -> ChangeSheet($Index);

            foreach ($Reader as $Row)
            {
                print_r($Row);
            }
        }
    ?>

If you change to the sheet that is already open, the position in the file still reverts to the beginning, which matches
the behavior of changing to a different sheet.

### Testing

From the command line:

php test.php path-to-spreadsheet.xls

In the browser:

http://path-to-library/test.php?File=/path/to/spreadsheet.xls

### Notes about library performance
* CSV and text files are read strictly sequentially, so performance should be O(n);
* When parsing XLS files, all of the file content is read into memory, so large XLS files can lead to "out of memory" errors;
* XLSX files use so-called "shared strings" internally to optimize for cases where the same string is repeated multiple times.
Internally, XLSX is XML text that is parsed sequentially to extract data from it. In some cases these shared strings are a problem:
Excel may put all, or nearly all, of the strings from the spreadsheet in the shared string file (which is a separate XML text), and not necessarily in the same
order. The worst case is reverse order: for each string, the shared string XML has to be parsed from the beginning if the data is not to be kept in memory.
To that end, the XLSX parser has a cache for shared strings that is used if the total shared string count is not too high. If you get out of memory errors, you can
try lowering the *SHARED_STRING_CACHE_LIMIT* constant in SpreadsheetReader_XLSX.

### TODOs:
* ODS date formats;

### Licensing
All of the code in this library is licensed under the MIT license as included in the LICENSE file, however, for now the library
# Installation (Composer)

[![Latest Stable Version](https://poser.pugx.org/jackmartin/readerexcel/v/stable)](https://packagist.org/packages/jackmartin/readerexcel) [![Total Downloads](https://poser.pugx.org/jackmartin/readerexcel/downloads)](https://packagist.org/packages/jackmartin/readerexcel) [![License](https://poser.pugx.org/jackmartin/readerexcel/license)](https://packagist.org/packages/jackmartin/readerexcel)

```

composer require jackmartin/readerexcel

```

# Description

**spreadsheet-reader** is a PHP spreadsheet reader that differs from others in that its main goal is efficient
data extraction that can handle large (as in really large) files. So far it is not necessarily CPU-, time-
or I/O-efficient, but at least it won't run out of memory (except maybe for XLS files).

# Requirements:
* PHP 5.3.0 or newer
* PHP must have Zip file support (see http://php.net/manual/en/zip.installation.php)
* Composer
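
The first two requirements can be checked at runtime before the reader is used. The snippet below is a minimal sketch: `missingRequirements()` is a hypothetical helper that is not part of the library, and it relies only on PHP's built-in `version_compare()` and `extension_loaded()`.

```php
<?php
// Hypothetical helper (not part of the library): lists which of the
// requirements above are missing in the current PHP runtime.
function missingRequirements()
{
    $missing = array();

    // The README asks for PHP 5.3.0 or newer.
    if (version_compare(PHP_VERSION, '5.3.0', '<')) {
        $missing[] = 'PHP 5.3.0 or newer (running ' . PHP_VERSION . ')';
    }

    // XLSX and ODS files are Zip archives, so the zip extension is needed.
    if (!extension_loaded('zip')) {
        $missing[] = 'zip extension';
    }

    return $missing;
}

foreach (missingRequirements() as $item) {
    echo 'Missing requirement: ' . $item . PHP_EOL;
}
```

An empty result means both checks passed; Composer itself is a command-line tool and is not checked here.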

# Usage:

Reading multiple sheets is supported for file formats that allow it. (A CSV file is handled as if
it has only one sheet.)

You can retrieve information about sheets contained in the file by calling the `Sheets()` method which returns an array with
sheet indexes as keys and sheet names as values. Then you can change the sheet that's currently being read by passing that index
to the `ChangeSheet($Index)` method.

### Example: ###

```

php reader.php test.xlsx

```

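```
<?php
// Load Composer's autoloader so the library class can be found.
include_once __DIR__ . '/vendor/autoload.php';

use ReaderExcel\SpreadsheetReader;

$file = __DIR__ . '/test.xlsx';

$reader = new SpreadsheetReader($file);

// Sheets() returns an array of sheet index => sheet name.
$Sheets = $reader->Sheets();

foreach ($Sheets as $Index => $Name) {
    // Switch to the sheet; reading starts from its first row.
    $reader->ChangeSheet($Index);

    foreach ($reader as $Key => $Row) {
        print_r($Row);
    }
}
```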

If you change to the sheet that is already open, the position in the file still reverts to the beginning, which matches
the behavior of changing to a different sheet.

# Notes about library performance
* CSV and text files are read strictly sequentially, so performance should be O(n);
* When parsing XLS files, all of the file content is read into memory, so large XLS files can lead to "out of memory" errors;
* XLSX files use so-called "shared strings" internally to optimize for cases where the same string is repeated multiple times.
Internally, XLSX is XML text that is parsed sequentially to extract data from it. In some cases these shared strings are a problem:
Excel may put all, or nearly all, of the strings from the spreadsheet in the shared string file (which is a separate XML text), and not necessarily in the same
order. The worst case is reverse order: for each string, the shared string XML has to be parsed from the beginning if the data is not to be kept in memory.
To that end, the XLSX parser has a cache for shared strings that is used if the total shared string count is not too high. If you get out of memory errors, you can
try lowering the *SHARED_STRING_CACHE_LIMIT* constant in SpreadsheetReader_XLSX.
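
The trade-off behind the shared string cache can be illustrated with a toy model. This is only a sketch under stated assumptions: the `BoundedSharedStrings` class, its `$limit` parameter and its `$scans` counter are hypothetical and do not mirror the internals of SpreadsheetReader_XLSX. It uses PHP's `XMLReader` extension to walk the `<si>` entries of a shared string document, caches the first `$limit` of them, and rescans from the beginning for anything beyond the limit.

```php
<?php
// Toy model of a bounded shared-string cache. All names here are
// hypothetical; this is not the library's actual implementation.
class BoundedSharedStrings
{
    public $scans = 0;          // number of sequential passes over the XML
    private $xml;
    private $limit;
    private $cache = array();

    public function __construct($xml, $limit)
    {
        $this->xml = $xml;
        $this->limit = $limit;
    }

    public function get($wanted)
    {
        // Cheap path: the string was cached during an earlier pass.
        if (isset($this->cache[$wanted])) {
            return $this->cache[$wanted];
        }

        // Expensive path: parse the XML again from the beginning.
        $this->scans++;
        $reader = new XMLReader();
        $reader->XML($this->xml);
        $index = 0;
        $found = null;
        while ($reader->read()) {
            if ($reader->nodeType !== XMLReader::ELEMENT || $reader->localName !== 'si') {
                continue;
            }
            $value = $reader->readString();
            // Only the first $limit strings are kept in memory.
            if ($index < $this->limit) {
                $this->cache[$index] = $value;
            }
            if ($index === $wanted) {
                $found = $value;
                break;
            }
            $index++;
        }
        $reader->close();

        return $found;
    }
}

$xml = '<sst><si><t>alpha</t></si><si><t>beta</t></si><si><t>gamma</t></si></sst>';
$strings = new BoundedSharedStrings($xml, 2);
echo $strings->get(2), PHP_EOL; // beyond the limit: found by a full scan
echo $strings->get(0), PHP_EOL; // within the limit: served from the cache
```

In this model a lower limit means less memory but more rescans for strings past the limit, which is the same trade-off the note above describes for lowering the cache limit.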

# TODOs:
* ODS date formats;

# Licensing
All of the code in this library is licensed under the MIT license as included in the LICENSE file, however, for now the library
relies on php-excel-reader library for XLS file parsing which is licensed under the PHP license.