Metadata-Version: 2.1
Name: LindexURI
Version: 0.1.2
Summary: Hadoop URI utility
Home-page: https://github.com/ozw1z5rd/LindexURI.git
Author: Alessio Palma
Author-email: ozw1z5rd@gmail.com
License: UNKNOWN
Description: # LindexURI
        BigData URI utility
        
        The idea is that every information stored is addressable using a URI, in this case we like to work with HDFS and HIVE. There
        
        LindexURI.isValid( uri ) : returns true if the URI is valid, it's a static method and can be used in a quick way.
        
        ### luri = LindexURI(uri) 
        ### luri.isPartitioned()
        returns true if the HIVE uri is defining a partitioned table
        
        if uri == "hive://databasename/tablename?dt=201212" luri.isPartitioned returns True.
        
        ### luri.getPartitions() 
        returns a dictionary that describes the HIVE partition
        
        if uri == "hive://databasename/tablename?dt=201212" luri.getPartitions() returns
        
        OrderedDict( 'dt': '201212' ) 
        
        ### luri.getDatabase()
        gets the database name from the HIVE uri ( this can be modified to work also with HDFS paths )
        
        if uri == "hive://databasename/tablename?dt=201212" luri.getDatabase() returns 'databasename'
        
        ### luri.getTable()
        gets the table name from HIVE uri, can be modified to work also with HDFS paths
        
        if uri == "hive://databasename/tablename?dt=201212" luri.getDatabase() returns 'tablename'
        
        ### luri.getHDFSHostName()
        gets the HDFS hostname 
        
        if uri == "hdfs://hdfs-prod/warehouse/databasename.db/tablename.db/dt=201212" luri.getHDFSHostName returns 'hdfs-prod'
        
        ### luri.getHDFSPath()
        gets the path from the HDFS uri 
        
        if uri == "hdfs://hdfs-prod/warehouse/databasename.db/tablename.db/dt=201212" luri.getHDFSPath() returns 'warehouse/databasename.db/tablename.db/dt=201212'
        
        ### luri.getSchema()
        gets the schema 
        
        if uri == "hdfs://hdfs-prod/warehouse/databasename.db/tablename.db/dt=201212" luri.getSchema() returns 'hdfs'
        
        ### luri.getPartitionsAsHDFSPath()
        converts the partition coordinates into an HDFS path
        
        p = OrderedDict( 'dt' : '201212', 'country': 'AU' ) 
        dt=201212&country=AU
        
        ### luri.getHDFSPathAsPartition()
        converts the HDFS path into a partition coordinates dictionary
        
               'hdfs://hdfs-production/Vault/Docomodigital/Production/Newton/events/prod/year=2018/month=08/day=07/hour=09'
        
                root path : "/Vault/Docomodigital/Production/Newton/events/prod/"
        
                partitions : {
                    "year" : "2018",
                    "month" : "08",
                    "day" : "07",
                    "hour" : "09"
                }
                
        
        
        ### luri.looksPartitioned()
        returns true if the HDFS path can define a partition
        
        
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 2
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Description-Content-Type: text/markdown
